Why is diffusers so much slower than ComfyUI?

Hi,

I’m new to Stable Diffusion and am currently trying to build some understanding by using diffusers and ComfyUI. What puzzles me is the performance difference between the two. Running the example code below, one iteration takes about 15 s. In ComfyUI, one iteration takes about 1 s. (Both pipelines use a resolution of 1024x1024.)

Any idea why there might be such a huge difference?

I tried some of the hints given in the OPTIMIZATION/SPECIAL HARDWARE chapter of the docs:

torch.compile does not work under Windows.
xFormers does not give the expected boost either. That is, there is no difference between calling

enable_xformers_memory_efficient_attention()

and not calling it.

from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
pipe.to("cuda")
pipe.enable_xformers_memory_efficient_attention()

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt=prompt).images[0]

So any hint is appreciated.

AlfaPapa

Got it :grinning:

enable_model_cpu_offload()

really boosts it! In my case it went from 15 s/it to 1 s/it.

See the Memory and speed page in the diffusers docs.
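For anyone landing here later, this is roughly how the change fits into the script from the question. It is a minimal sketch, not a tuned setup: `build_pipeline`, `seconds_per_step`, `main` and the step count of 30 are names and values made up for illustration, and `enable_model_cpu_offload()` needs the `accelerate` package installed. The important detail is that offloading replaces the `pipe.to("cuda")` call rather than being added on top of it. A plausible reason for the speedup (an assumption, not something verified in this thread) is that the full fp16 SDXL pipeline did not fit in VRAM, and offloading keeps only the component that is currently running on the GPU.

```python
import time


def build_pipeline(offload: bool = True):
    """Load SDXL base as in the question, optionally with model CPU offload."""
    # Imported lazily so the timing helper below also works without a GPU.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    if offload:
        # Moves each sub-model to the GPU only while it is needed.
        # Do not also call pipe.to("cuda") in this case.
        pipe.enable_model_cpu_offload()
    else:
        pipe.to("cuda")
    return pipe


def seconds_per_step(run, steps: int) -> float:
    """Time a single call to run() and divide the wall time by `steps`."""
    start = time.perf_counter()
    run()
    return (time.perf_counter() - start) / steps


def main():
    prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
    steps = 30  # hypothetical value, chosen only for this sketch
    pipe = build_pipeline(offload=True)
    rate = seconds_per_step(
        lambda: pipe(prompt=prompt, num_inference_steps=steps), steps
    )
    print(f"{rate:.2f} s/it (rough: includes text encoding and VAE decode)")
```

Calling `main()` once with `offload=True` and once with `offload=False` should show whether offloading is what makes the difference on a given machine. The number is only a rough figure, since the whole pipeline call is timed and not just the denoising loop.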