Flux.1 [schnell] is too slow

Hello! When I use Flux.1 with the diffusers library, it gives me this warning:

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers

and it also runs extremely slowly. What’s the problem? I tried both bf16 and fp16, but it made no difference whatsoever.
Here’s my code:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "Draw a 2D red ball with no background, without shadows"
image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=3.5,
    num_inference_steps=40,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux.png")

Try disabling CPU offloading and keep everything on the GPU if you have enough GPU memory available.

#pipe.enable_model_cpu_offload()

Now it’s even slower. My CPU is at 60% usage, but the GPU is at 0%. What’s going on?

I think that the GPU version of torch may not have been installed correctly, and that the CPU version may have been installed instead.

I’ve installed torch with this command:

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126

How do I check if I have the right version installed?

How do I check if I have the right version installed?

import torch
print(torch.cuda.is_available())
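
If that prints “True” but things are still slow, a slightly fuller check can help. This is only a minimal sketch: the helper name `describe_torch_install` is made up for illustration, and it just reads standard torch attributes (build version, the CUDA version the wheel was built with, and the GPU name and VRAM if one is visible).

```python
import importlib.util


def describe_torch_install():
    """Collect basic facts about the local PyTorch install (hypothetical helper)."""
    info = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not info["torch_installed"]:
        return info

    import torch

    info["torch_version"] = torch.__version__
    # None on CPU-only builds, a version string on CUDA builds.
    info["built_with_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["gpu_vram_gb"] = round(props.total_memory / 1024**3, 1)
    return info


for key, value in describe_torch_install().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` shows `None`, the CPU-only wheel was installed. The VRAM figure is useful for judging whether a given model can fit on the card at all.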

I’ve installed torch with this command:

I recommend this instead:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

1. It prints “True”.
2. I have CUDA 12.6.

Try this.

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16).to("cuda")
#pipe.enable_model_cpu_offload()

And if you haven’t installed it yet:

pip install -U accelerate
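
Separately from where the model runs, the call settings matter for speed. As far as I know, the FLUX.1-schnell model card example uses 4 inference steps, `guidance_scale=0.0`, and `max_sequence_length=256`, whereas the code in the first post uses 40 steps. A minimal sketch of those settings follows; the helper name `schnell_call_kwargs` is hypothetical and only bundles the arguments.

```python
def schnell_call_kwargs(height=512, width=512):
    """Call arguments modeled on the FLUX.1-schnell model card example.

    Hypothetical helper for illustration; it only builds a dict.
    """
    return {
        "height": height,
        "width": width,
        # The model card example passes 0.0 for schnell.
        "guidance_scale": 0.0,
        # schnell is distilled to work in 1 to 4 steps, so 40 is mostly wasted time.
        "num_inference_steps": 4,
        # The model card example uses 256 for schnell.
        "max_sequence_length": 256,
    }


print(schnell_call_kwargs())

# Usage, assuming `pipe` is the FluxPipeline loaded above:
# image = pipe(prompt, **schnell_call_kwargs()).images[0]
```

With ten times fewer steps, the total time drops by roughly the same factor, although the time per step stays the same.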

Is 226.04 s/it OK? I have a GTX 1650.

Even the RTX 4090 isn’t enough…

Oh, ok. Sorry for being stupid :sweat_smile: I’m very new to AI

Are there lightweight models that can generate simple 2D art quickly?

FLUX is one of the newest models, so it’s very large and needs an unusually large amount of VRAM. For comparison, SD1.5 is about 2GB, SDXL is about 6.5GB, and FLUX is around 32GB. Quantization can reduce memory consumption, but even then the model only shrinks to roughly a quarter of its original size.
That’s why many people use cloud services rather than their own PCs.
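
To put rough numbers on the quantization point, here is a back-of-the-envelope sketch. It assumes about 12 billion parameters for the FLUX transformer alone (the figure I recall from the model card) and counts only weight storage, so real usage will be higher once the text encoders, the VAE, and activations are included.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (10^9 bytes) at a given precision."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: ~12B parameters for the FLUX transformer only.
FLUX_TRANSFORMER_PARAMS = 12e9

for bits in (16, 8, 4):
    size = weight_memory_gb(FLUX_TRANSFORMER_PARAMS, bits)
    print(f"{bits}-bit weights: about {size:.0f} GB")
```

Going from 16-bit to 4-bit weights cuts the size to a quarter, which is still more than a 4GB card can hold.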

Are there lightweight models that can generate simple 2D art quickly?

SD1.5- or SDXL-based ones. For anime, I recommend SDXL.

One of the SDXL models

Thanks! I should’ve read that before installing such a large model… :smiling_face_with_tear:

A lot of people are surprised by the size of FLUX. I don’t have a powerful GPU either, so it won’t run locally for me! :sob:

Spaces is free to use unless you get scammed, so you should try out different things and find a compromise.
If you’re okay with SDXL, then as long as you have a video card with 12GB VRAM, it should work. Even with 8GB, you can manage if you try hard.
