I have been playing with diffusers for a while and find it very interesting.
However, I have recently been confused about why StableDiffusionImg2ImgPipeline has a “strength” parameter while StableDiffusionInpaintPipeline does not. As far as I understand, the two work on a similar principle.
I searched the diffusers and AUTOMATIC1111 issue trackers and found these two issues:
- Inpainting example is using data from the masked parts with strength 1 · Issue #261 · huggingface/diffusers · GitHub
- [Bug]: "Inpainting conditioning mask strength" value between >0 and <1 results in desaturated colors · Issue #5557 · AUTOMATIC1111/stable-diffusion-webui · GitHub
It seems that AUTOMATIC1111 supports a “strength” parameter for inpainting, and earlier versions of diffusers did as well.
Does anyone know why diffusers does not support a “strength” parameter in StableDiffusionInpaintPipeline?
Hey @bernie40916 thanks for the post!
In the img2img case, the strength parameter determines how much noise is added to the input image. That noised image is used as the sole latent input to the unet.
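To make this concrete, here is a small sketch of how strength maps to the denoising schedule in img2img: the init image is noised to a point partway along the schedule, and only the remaining steps are run. The function name `img2img_timesteps` is my own; the logic mirrors the idea behind the pipeline's internal timestep selection, not its exact code.

```python
def img2img_timesteps(num_inference_steps: int, strength: float):
    """Return (t_start, steps_run) for a given strength.

    strength=1.0 -> image is fully noised, all steps run (pure generation).
    strength=0.3 -> image is lightly noised, only the last 30% of steps run.
    """
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    steps_run = num_inference_steps - t_start
    return t_start, steps_run

# With 50 scheduler steps:
print(img2img_timesteps(50, 1.0))  # (0, 50)  - start from pure noise
print(img2img_timesteps(50, 0.3))  # (35, 15) - skip the noisiest 35 steps
```

Lower strength therefore keeps the result closer to the init image, because fewer denoising steps are run from a less-noised starting point.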
In the inpainting case, the unet input is composed of: 1. random noise, 2. the mask, and 3. the latents of the masked image being inpainted.
We don’t have the ability to determine how much noise is added to 1 (which would be the equivalent of the img2img case). I’m not sure whether noising 2 would be sound. We don’t add any noise to 3; we could choose to add some amount, but I don’t know how sound that is or whether it would give the expected results.
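As a rough illustration of that three-part input, here is a toy sketch of the channel layout the inpainting unet receives (the shapes are stand-ins for Stable Diffusion's latent resolution; the variable names are mine, not the pipeline's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent shapes: (batch, channels, height, width)
b, h, w = 1, 64, 64
noise_latents = rng.standard_normal((b, 4, h, w))         # 1. pure random noise
mask = rng.random((b, 1, h, w)).round()                   # 2. downsampled binary mask
masked_image_latents = rng.standard_normal((b, 4, h, w))  # 3. VAE latents of the masked image

# The inpainting unet takes all three concatenated along the channel axis:
unet_input = np.concatenate([noise_latents, mask, masked_image_latents], axis=1)
print(unet_input.shape)  # (1, 9, 64, 64) - 4 + 1 + 4 input channels
```

This is why an img2img-style strength knob doesn't transfer directly: only component 1 is noise, while 2 and 3 are clean conditioning signals.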
If you’re interested in exploring whether noising the masked image provides any benefit, feel free to cite existing work or run some experiments demonstrating the effect. Happy to merge a PR adding noise via a strength parameter if it shows promise.