Google Colab doesn't seem very welcoming toward image generation, so there could be several causes, but Flux generation usually seems to work there, so it's probably not a restriction or anything like that.
Assuming you are using the Flux dev version, the only parameters that tend to cause problems on the free tier of Colab are the image width and num_inference_steps. Try reducing these.
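For example, in diffusers something like this keeps the resolution and step count low (the numbers here are just a starting point, and the repo name assumes the gated FLUX.1-dev checkpoint):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keeps VRAM usage down on a 16GB T4

image = pipe(
    "a photo of a cat",
    width=512,               # well below the 1024x1024 default
    height=512,
    num_inference_steps=20,  # fewer denoising steps than usual
    guidance_scale=3.5,
).images[0]
image.save("out.png")
```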
If the settings are outright impossible you will get an error, but if the runtime decides it might be able to finish given enough time, it will simply keep running forever.
Also, if RAM, VRAM, or disk space runs out, it may stall without raising an error, and the behavior in that case is similar.
Could it be the free version of Colab?
If so, with 16GB of VRAM it will take ages to generate; I'm not even sure it would finish within an hour.
Sorry that the page is in Japanese, but you can try it with a quantized Flux as shown on the following page.
I think you can find the same know-how in English if you search for it.
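For reference, here is a rough sketch of one way to run a 4-bit-quantized Flux transformer in diffusers. This assumes a recent diffusers with bitsandbytes installed; the linked page may use a different method.

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Quantize only the big transformer to NF4 to fit in limited VRAM.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```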
Thanks a lot for the suggestions. Actually, it turns out that Colab does not always grant a GPU runtime the way it always provides a CPU one. It should work when and if a GPU is available. I was wondering whether we could make any fundamental changes to the FlowMatchEulerDiscreteSchedulerOutput class in the scheduling_flow_match_euler_discrete module.
Is the idea to dynamically switch whether CUDA is used?
The scheduler is not an ordinary iterator or anything like that; it is in charge of the denoising schedule, so unless you know it well, modifying it will ruin the image. Strictly speaking they are not quite the same thing, but it roughly corresponds to what WebUI and ComfyUI call a sampler.
A better approach might be to check whether CUDA is usable right before inference and, if not, give up.
You should be able to check that with torch.cuda.is_available().
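A minimal sketch of that check (the function name and the step count are just placeholders):

```python
import torch
from diffusers import DiffusionPipeline

def generate_or_abort(pipe: DiffusionPipeline, prompt: str):
    """Run inference only if a CUDA device is actually available."""
    if not torch.cuda.is_available():
        # Bail out instead of silently falling back to an endless CPU run.
        raise RuntimeError("No CUDA device available; refusing to run on CPU.")
    pipe.to("cuda")
    return pipe(prompt, num_inference_steps=20).images[0]
```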
Running even an SD 1.5 model on CPU alone is not realistic, at least for now, let alone a Flux model.
It is at least possible to offload some of the weights to RAM when VRAM is insufficient, and pipe.enable_model_cpu_offload() is enough for that.
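For example, with pipe being an already-constructed FluxPipeline:

```python
pipe.enable_model_cpu_offload()        # move each sub-model to the GPU only while it runs
# or, much slower but with a smaller VRAM footprint:
# pipe.enable_sequential_cpu_offload()
```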
So you want to speed up inference itself? Then wouldn't it be easier to use something like Hyper-FLUX? The page below says SD, but you can also find the SDXL and Flux LoRAs there.
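As a rough sketch of how that LoRA would be wired in with diffusers (the repo name, weight file name, and fuse scale are taken from memory of the ByteDance/Hyper-SD model card, so double-check them there):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Repo and weight_name are assumptions -- check the Hyper-SD model card for exact names.
pipe.load_lora_weights(
    "ByteDance/Hyper-SD", weight_name="Hyper-FLUX.1-dev-8steps-lora.safetensors"
)
pipe.fuse_lora(lora_scale=0.125)  # scale suggested on the model card, if I recall correctly
pipe.enable_model_cpu_offload()

# The whole point is that far fewer steps are needed:
image = pipe("a photo of a cat", num_inference_steps=8, guidance_scale=3.5).images[0]
```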
Also, a couple of techniques have recently been proposed on HF, though they may not be as easy to use as Hyper.