Notable differences between other implementations of stable diffusion, particularly in the img2img pipeline

tonetechnician · October 18, 2022, 8:18pm

Hey there!

I’ve been running extensive tests comparing diffusers’ Stable Diffusion implementation with AUTOMATIC1111’s and NMKD-SD-GUI’s (both of which wrap the CompVis/stable-diffusion repo). I wanted to report some observations and see whether the community can shed some light on them.

For DDIM, I get different output from the same configuration (20 steps, 7.5 CFG, seed 0). (Images to follow in posts)

For LMS, the txt2img output is exactly the same (20 steps, 7.5 CFG, seed 0), but the img2img output is very different. Most notably, there seems to be some kind of smoothing happening in diffusers that makes the output lose crispness. (Images to follow in posts)

This leads me to believe there is some underlying difference in the parameters being fed to the algorithm, or in the architecture itself. I find it strange that LMS gives the same output for txt2img but different output for img2img, which makes me suspect that the VAE part of the model architecture in diffusers differs from CompVis’. I’ve noticed that there are quite noticeable differences between diffusers and the regular stable-diffusion inference model config: stable-diffusion/v1-inference.yaml at main · CompVis/stable-diffusion · GitHub

Would love it if someone more knowledgeable could shed some more light on this! Personally, for img2img I find that Automatic’s implementation looks a lot crisper and more natural. It also seems to break from the form of the original image a bit more.

tonetechnician · October 18, 2022, 8:19pm

Here are the image references. Due to new user status, I wasn’t able to add them all in a single post.

For DDIM, I get different output from the same configuration (20 steps, 7.5 CFG, seed 0), as shown below.

DIFFUSERS DDIM txt2img:

tonetechnician · October 18, 2022, 8:20pm

AUTOMATIC1111 DDIM txt2img:

tonetechnician · October 18, 2022, 8:21pm

For LMS, the txt2img output is exactly the same (20 steps, 7.5 CFG, seed 0), but the img2img output is very different. Most notably, there seems to be some kind of smoothing happening in diffusers that makes the output lose crispness.

DIFFUSERS LMS txt2img:

tonetechnician · October 18, 2022, 8:22pm

As we can see here, the output is exactly the same as Diffusers’, which is what I would expect for all the schedulers.
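"Exactly the same" can also be checked numerically rather than by eye. The following is a minimal sketch in plain Python, assuming both outputs have already been decoded to 8-bit RGB pixels of the same size; the function name and the tiny 2x2 images are hypothetical stand-ins, not data from these tests.

```python
def max_abs_diff(img_a, img_b):
    """Largest per-channel difference between two equally sized images.

    Images are nested lists: rows -> pixels -> (R, G, B) ints in 0..255.
    """
    same_height = len(img_a) == len(img_b)
    same_widths = all(len(ra) == len(rb) for ra, rb in zip(img_a, img_b))
    if not (same_height and same_widths):
        raise ValueError("images must have the same dimensions")
    return max(
        abs(ca - cb)
        for row_a, row_b in zip(img_a, img_b)
        for px_a, px_b in zip(row_a, row_b)
        for ca, cb in zip(px_a, px_b)
    )


# Hypothetical 2x2 images standing in for two decoded outputs.
out_a = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
out_b = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 121)]]

print(max_abs_diff(out_a, out_a))  # 0: bit-identical
print(max_abs_diff(out_a, out_b))  # 1: differs by one level in one channel
```

A result of 0 means the two implementations agree bit for bit; a small nonzero value would point to rounding or precision differences rather than a different algorithm.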

AUTOMATIC1111 DDIM txt2img:

tonetechnician · October 18, 2022, 8:23pm

For img2img, LMS does not have the same result. (20 steps, 7.5 CFG, 0 seed, 0.5 Strength)
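For context on what a strength of 0.5 does to a 20-step schedule: to the best of my understanding of the diffusers img2img pipeline, strength decides how many of the scheduler's steps actually run, and the init image is noised to the level of the first step that runs. The sketch below reproduces that arithmetic in plain Python. The function name is hypothetical, and other implementations may round or offset the start index differently, which is one place where img2img results could diverge while txt2img results still match.

```python
def img2img_step_window(num_inference_steps, strength):
    """Return (start_index, steps_run) for an img2img schedule.

    Assumption: this mirrors the strength-to-steps arithmetic of the
    diffusers img2img pipeline; it is a sketch, not the library code.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Number of steps that will actually be denoised.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    # Index into the scheduler's timesteps where denoising starts.
    t_start = max(num_inference_steps - init_timestep, 0)
    return t_start, num_inference_steps - t_start


print(img2img_step_window(20, 0.5))  # (10, 10)
```

Under this assumption, the settings above skip the first 10 of the 20 scheduled steps and denoise over the remaining 10.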

DIFFUSERS LMS img2img:

tonetechnician · October 18, 2022, 8:24pm

AUTOMATIC1111 LMS img2img:

Very different, and it seems to me that Automatic produces a more resolved image.
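The impression of lost crispness could be put on a numeric footing with a simple sharpness score. The following is a minimal sketch, assuming the images have been converted to grayscale lists of rows; it uses the variance of a 4-neighbour Laplacian as a rough proxy for high-frequency detail. The function names and the synthetic checkerboard are illustrative only, not measurements from the images in this thread.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of rows of numbers. Higher values mean more
    high-frequency detail, a rough proxy for perceived crispness.
    """
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def box_blur(gray):
    """3x3 mean filter on the interior; border pixels are copied unchanged."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                gray[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ) / 9
    return out


# Synthetic checkerboard standing in for a crisp output (not real data).
crisp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
soft = box_blur(crisp)

print(laplacian_variance(crisp) > laplacian_variance(soft))  # True
```

Running the same score on the two img2img outputs would show whether the diffusers result really carries less fine detail, or whether the difference is mostly in composition.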

tonetechnician · October 20, 2022, 5:06pm

Sorry, there was a typo here: it should be “AUTOMATIC1111 LMS txt2img”.

mishmish-g · October 23, 2022, 12:30pm

I have the same problem: I’m using the DDIM sampler with the CompVis Stable Diffusion repo and getting different results than when loading the model with the diffusers library.
I went over both model configs and couldn’t find any difference. I also made sure the same model ckpt is used.
Would really appreciate some help with that!

tnarek · October 23, 2022, 12:34pm

Same issue here: I’m getting different outputs from the diffusers and CompVis models under the same config.

rauln · July 3, 2023, 2:44pm

Same issue for me. I have exactly the same settings in Automatic as in diffusers but still get completely different results.
