Why is diffusers so much slower than ComfyUI?

Hi,

I’m new to Stable Diffusion and am currently trying to gain some understanding by using diffusers and ComfyUI. What puzzles me is the performance difference between the two. Running the example code below, one iteration takes about 15s; with ComfyUI, one iteration takes about 1s. (In both pipelines I’m using 1024x1024 as the resolution.)

Any idea why there might be such a huge difference?

I tried some of the hints given in the OPTIMIZATION/SPECIAL HARDWARE chapter in the docs:

torch.compile does not work under Windows.

xFormers also does not give the expected boost. That is, there is no difference between using

enable_xformers_memory_efficient_attention()

and not using it (a quick way to check this is shown after the example code below).

from diffusers import StableDiffusionXLPipeline
import torch

# Load the SDXL base model in half precision (fp16 safetensors variant)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
pipe.to("cuda")
pipe.enable_xformers_memory_efficient_attention()

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt=prompt).images[0]
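
(Aside: to verify whether xFormers is actually active, you can print the attention processor classes the UNet is using — a quick diagnostic sketch, assuming a diffusers version recent enough to expose attn_processors:)

print({type(p).__name__ for p in pipe.unet.attn_processors.values()})
# 'XFormersAttnProcessor' means xFormers is in use; on PyTorch 2.x the
# default 'AttnProcessor2_0' already uses fused scaled_dot_product_attention,
# which may be why enabling xFormers changes nothing.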

So any hint is appreciated.

AlfaPapa

Got it :grinning:

enable_model_cpu_offload()

really boosts it! In my case, it went from 15s/it to 1s/it.

See Speed up inference
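
In case it helps others, here is a minimal sketch of where that call goes, based on the example code above. Note: do not combine it with pipe.to("cuda"); enable_model_cpu_offload() manages device placement itself. Presumably the 15s/it came from the model overflowing VRAM into slow shared system memory, which offloading avoids.

from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
# Replaces pipe.to("cuda"): each sub-model is moved to the GPU only while
# it runs, keeping peak VRAM usage low.
pipe.enable_model_cpu_offload()

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt=prompt).images[0]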

I’m on a laptop 3060 and it’s incredibly slow. Is anybody still here? Can I get some help with how to do this, please?

VAE tiling really makes the difference here. On my local machine with only 8 GB of VRAM, this one line:

pipe.enable_vae_tiling()

reduced inference time from 117 seconds to 40 seconds.
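
For context, a sketch of the same SDXL example from the first post with tiling enabled (the call takes no arguments):

from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
pipe.to("cuda")
# Decode the final latents tile by tile instead of all at once, so the
# VAE never holds the full 1024x1024 decode in VRAM at one time.
pipe.enable_vae_tiling()

image = pipe(prompt="Astronaut in a jungle, cold color palette").images[0]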