I run the code below on an RTX 3090 with a Ryzen 9 7900X and 128 GB RAM. Generating a single 512x512 image takes 20 minutes.
Is that normal? I read that it should take just seconds.

    import time

    import torch
    from diffusers import FluxPipeline

    start = time.time()
    print("CUDA available:", torch.cuda.is_available())
    print("Device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU")

    # Load the full pipeline in bfloat16 and move every component to the GPU.
    pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    prompt = "a wolf running"
    images_ = pipe(
        prompt,
        # width=1920,
        # height=1088,
        width=512,
        height=512,
        guidance_scale=3.5,
        num_inference_steps=50,
        max_sequence_length=512,
        generator=torch.Generator(device="cuda").manual_seed(0),
    ).images

    for i, image in enumerate(images_):
        image.save("flux-dev" + str(i) + ".png")

    # Note: the measured time covers model loading as well as generation.
    end = time.time()
    print(f"Generation took {end - start:.2f} seconds")
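
A rough check of whether the weights even fit in the 3090's 24 GB may help narrow this down. The sketch below estimates weight memory only, in bfloat16 (2 bytes per parameter). It assumes about 12B parameters for the FLUX.1-dev transformer (the figure given on the model card) and roughly 4.7B for the T5-XXL text encoder. The CLIP and VAE counts are ballpark guesses, activations are ignored, and all names in the snippet are made up for illustration.

```python
# Back-of-envelope estimate of weight memory for the pipeline in bfloat16.
# All parameter counts are approximate assumptions, not measured values.
BYTES_PER_PARAM_BF16 = 2
GIB = 1024 ** 3
VRAM_GIB = 24  # nominal VRAM of an RTX 3090

approx_params = {
    "transformer (FLUX.1-dev)": 12_000_000_000,       # per the model card
    "text_encoder_2 (T5-XXL encoder)": 4_700_000_000,  # rough figure
    "text_encoder (CLIP-L)": 120_000_000,              # ballpark guess
    "vae": 80_000_000,                                 # ballpark guess
}


def weight_gib(num_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def total_weight_gib(params):
    """Sum of weight memory over all pipeline components, in GiB."""
    return sum(weight_gib(n) for n in params.values())


def fits_in_vram(params, vram_gib=VRAM_GIB):
    """True if the weights alone fit; activations would need extra room."""
    return total_weight_gib(params) <= vram_gib


for name, n in approx_params.items():
    print(f"{name}: {weight_gib(n):.2f} GiB")
print(f"total: {total_weight_gib(approx_params):.2f} GiB, "
      f"fits in {VRAM_GIB} GiB: {fits_in_vram(approx_params)}")
```

Under these assumptions the total comes out above 24 GiB, so `pipe.to("cuda")` could not keep every component resident at once. On some setups the driver may then spill into system RAM instead of raising an out-of-memory error, which would be consistent with very slow steps, though that is only a guess for this machine. diffusers offers `pipe.enable_model_cpu_offload()` as an alternative to `pipe.to("cuda")` for this kind of situation.
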
CUDA is 12.1, Python is 3.10.
Packages (installed version | latest version):

Package | Installed version | Latest version |
---|---|---|
GitPython | 3.1.44 | 3.1.44 |
MarkupSafe | 2.1.5 | 3.0.2 |
PyYAML | 6.0.2 | 6.0.2 |
accelerate | 1.9.0 | 1.9.0 |
aiofiles | 23.2.1 | 24.1.0 |
altair | 5.5.0 | 5.5.0 |
annotated-types | 0.7.0 | 0.7.0 |
anyio | 4.9.0 | 4.9.0 |
attrs | 25.3.0 | 25.3.0 |
blinker | 1.9.0 | 1.9.0 |
cachetools | 6.1.0 | 6.1.0 |
certifi | 2025.7.14 | 2025.7.14 |
charset-normalizer | 3.4.2 | 3.4.2 |
click | 8.2.1 | 8.2.1 |
colorama | 0.4.6 | 0.4.6 |
diffusers | 0.34.0 | 0.34.0 |
einops | 0.8.1 | 0.8.1 |
exceptiongroup | 1.3.0 | 1.3.0 |
fastapi | 0.116.1 | 0.116.1 |
ffmpy | 0.6.0 | 0.6.0 |
filelock | 3.18.0 | 3.18.0 |
fire | 0.7.0 | 0.7.0 |
flux | 0.0.post58+g1371b2b | 1.3.5 |
fsspec | 2025.7.0 | 2025.7.0 |
gitdb | 4.0.12 | 4.0.12 |
gradio | 5.13.2 | 5.38.0 |
gradio-client | 1.6.0 | 1.11.0 |
h11 | 0.16.0 | 0.16.0 |
httpcore | 1.0.9 | 1.0.9 |
httpx | 0.28.1 | 0.28.1 |
huggingface-hub | 0.33.4 | 0.33.4 |
idna | 3.10 | 3.10 |
importlib-metadata | 8.7.0 | 8.7.0 |
invisible-watermark | 0.2.0 | 0.2.0 |
jinja2 | 3.1.6 | 3.1.6 |
jsonschema | 4.25.0 | 4.25.0 |
jsonschema-specifications | 2025.4.1 | 2025.4.1 |
markdown-it-py | 3.0.0 | 3.0.0 |
mdurl | 0.1.2 | 0.1.2 |
mpmath | 1.3.0 | 1.3.0 |
narwhals | 1.48.0 | 1.48.0 |
networkx | 3.4.2 | 3.5 |
numpy | 2.2.6 | 2.3.1 |
opencv-python | 4.12.0.88 | 4.12.0.88 |
orjson | 3.11.0 | 3.11.0 |
packaging | 25.0 | 25.0 |
pandas | 2.3.1 | 2.3.1 |
pillow | 11.3.0 | 11.3.0 |
pip | 25.1.1 | 25.1.1 |
protobuf | 6.31.1 | 6.31.1 |
psutil | 7.0.0 | 7.0.0 |
pyarrow | 21.0.0 | 21.0.0 |
pydantic | 2.11.7 | 2.11.7 |
pydantic-core | 2.33.2 | |
pydeck | 0.9.1 | 0.9.1 |
pydub | 0.25.1 | 0.25.1 |
pygments | 2.19.2 | 2.19.2 |
python-dateutil | 2.9.0.post0 | 2.9.0.post0 |
python-multipart | 0.0.20 | 0.0.20 |
pytz | 2025.2 | 2025.2 |
pywavelets | 1.8.0 | 1.8.0 |
referencing | 0.36.2 | 0.36.2 |
regex | 2024.11.6 | 2024.11.6 |
requests | 2.32.4 | 2.32.4 |
rich | 14.0.0 | 14.0.0 |
rpds-py | 0.26.0 | 0.26.0 |
ruff | 0.6.8 | 0.12.4 |
safehttpx | 0.1.6 | 0.1.6 |
safetensors | 0.5.3 | 0.5.3 |
semantic-version | 2.10.0 | 2.10.0 |
sentencepiece | 0.2.0 | 0.2.0 |
setuptools | 57.4.0 | 80.9.0 |
shellingham | 1.5.4 | 1.5.4 |
six | 1.17.0 | 1.17.0 |
smmap | 5.0.2 | 6.0.0 |
sniffio | 1.3.1 | 1.3.1 |
starlette | 0.47.2 | 0.47.2 |
streamlit | 1.47.0 | 1.47.0 |
streamlit-drawable-canvas | 0.9.3 | 0.9.3 |
streamlit-keyup | 0.3.0 | 0.3.0 |
sympy | 1.13.1 | 1.14.0 |
tenacity | 9.1.2 | 9.1.2 |
termcolor | 3.1.0 | 3.1.0 |
tokenizers | 0.21.2 | 0.21.2 |
toml | 0.10.2 | 0.10.2 |
tomlkit | 0.13.3 | 0.13.3 |
torch | 2.5.1+cu121 | 2.7.1 |
torchaudio | 2.5.1+cu121 | 2.7.1 |
torchvision | 0.20.1+cu121 | 0.22.1 |
tornado | 6.5.1 | 6.5.1 |
tqdm | 4.67.1 | 4.67.1 |
transformers | 4.53.2 | 4.53.2 |
typer | 0.16.0 | 0.16.0 |
typing-extensions | 4.14.1 | 4.14.1 |
typing-inspection | 0.4.1 | 0.4.1 |
tzdata | 2025.2 | 2025.2 |
urllib3 | 2.5.0 | 2.5.0 |
uvicorn | 0.35.0 | 0.35.0 |
watchdog | 6.0.0 | 6.0.0 |
websockets | 14.2 | 15.0.1 |
zipp | 3.23.0 | 3.23.0 |