Zero GPU Worker Error

[I did search first. The only other thread discussing a similar topic was deleted by its author, so here goes.]

For the last few days, while using ZeroGPU to generate content, I was getting quite a lot of “ZeroGPU worker error: RuntimeError” failures. They were eating up a lot of my daily 25 minutes of generation quota, but with intermittent retries I could at least finish a generation. Since today, however, the ZeroGPU worker errors appear almost instantaneously after clicking the generate button, making it impossible to generate anything at all.

Is anyone else facing (or has anyone faced) similar issues? And is there any workaround?

I was mostly using this module -

1 Like

I was able to reproduce the error. I thought it might be a bug, so I duplicated the Space, but the duplicate worked without any changes…:sweat_smile:

What is this… @hysts ?

Duplicated

Original

I tried Wan 2.2 5B - a Hugging Face Space by Wan-AI

Still getting the same error. So it may not be particular to the Space above, but rather affect all similar Spaces using ZeroGPU.

1 Like

Hmm, I’m not sure what the cause of the error is.

Looks like the Space was a modified version of Wan2.1 Fast - a Hugging Face Space by multimodalart, which is working fine.

The Space is not pinning dependency versions, so the error might be caused by dependency updates.
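For example, pinned entries in requirements.txt look something like this (the packages and version numbers below are just placeholders, not a known-good combination):

# Pin exact versions so every rebuild reproduces the same environment
gradio==5.1.0
diffusers==0.31.0
transformers==4.46.0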

I’ve asked internally about it, but it’s summer vacation season and many people are away, so we might not be able to get an answer right away.

1 Like

I have the exact same problem, and duplicating the Space didn’t work for me.
The problem is not specific to this Space; so far, every ZeroGPU Space I’ve tested has had the same problem.

1 Like

Oh. So even hysts can’t figure out the cause right away…
If I could reproduce the bug in my own environment, I could try some trial and error…

I was able to fix the issue on my Space by adjusting the requirements.txt.

Try adding these two lines to your requirements.txt and restart the Space:

safetensors
sentencepiece

1 Like

Which Spaces have you tried?
I get the same error in the Spaces mentioned above, but other than that, I can’t reproduce it.
I’ve tried duplicating some ZeroGPU Spaces, but they worked fine.
For example, these Spaces work fine.

1 Like

I’m in the same situation as hysts.

I am encountering the same error with Flux Kontext 1.1 Dev

1 Like

OK, I’ve just restarted FLUX.1 Kontext - a Hugging Face Space by black-forest-labs and it’s back up now. It’s probably unrelated to the issue discussed in this thread.

1 Like

Looking at the code, I found a change in Gradio’s specifications.
I can’t believe it, but could Gradio’s repeated specification changes be causing code conflicts…?

Gradio 4

cache_examples="lazy",

Gradio 5

cache_examples=True,
cache_mode="lazy",

Gradio 5 TODAY

cache_examples="lazy",

Edit:
This was a misunderstanding. It seems the older notation has always been accepted for compatibility reasons.

No, I don’t think so. The error in the log is related to CUDA. Also, if it were about caching examples, the Space probably wouldn’t be able to launch at all.

2 Likes

BTW, I’ve restarted WAN 2.1 Fast & security - a Hugging Face Space by Heartsync and it’s back up too. So, my guess is that it’s due to some dependency updates.

1 Like

Thank you.
Well, maybe the combination of library versions at the time the Space was launched was just bad.
Anyway, if it works fine after restarting, it’s probably not a big problem. :grinning_face:

The dependency issue still isn’t resolved, since we don’t know which library caused the error. Restarting a Space usually doesn’t rebuild it, which is why restarting fixed the original Space: it keeps the dependency versions from the existing build. Duplicating a Space, on the other hand, triggers a rebuild, and any dependencies that aren’t pinned get updated to their latest versions. That’s why the Wan 2.2 Space still isn’t working.
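To make the restart-versus-rebuild distinction concrete, here is a minimal sketch using huggingface_hub (the Space ID is a placeholder; restart_space reuses the already-built image, while factory_reboot=True forces a rebuild that re-resolves any unpinned dependencies):

from huggingface_hub import HfApi

api = HfApi()  # assumes a token with write access to the Space is configured

# Plain restart: keeps the existing image and its installed dependency versions
api.restart_space("your-username/your-space")

# Factory reboot: rebuilds the image, so unpinned dependencies are resolved again
api.restart_space("your-username/your-space", factory_reboot=True)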

1 Like

Looking at the code, I found a change in Gradio’s specifications.
I can’t believe it, but could Gradio’s repeated specification changes be causing code conflicts…?

Forgot to mention, but there’s a misunderstanding in this comment.

cache_examples still accepts "lazy" in gradio 5.x so that gradio 4.x Spaces that specify cache_examples="lazy" won’t break when upgrading to gradio 5.x; it just shows a warning that this will stop working in a future version. You mentioned that we’ve changed it repeatedly, but that’s not the case.
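For reference, a minimal sketch of both notations in a gradio 5.x app (the greet function and example inputs are just placeholders):

import gradio as gr

def greet(name):
    return f"Hello, {name}!"

# Current gradio 5.x notation: enable caching, choose the lazy strategy separately
demo = gr.Interface(
    fn=greet,
    inputs="text",
    outputs="text",
    examples=[["world"]],
    cache_examples=True,
    cache_mode="lazy",
)

# Legacy gradio 4.x notation, still accepted in 5.x with a deprecation warning:
# demo = gr.Interface(fn=greet, inputs="text", outputs="text",
#                     examples=[["world"]], cache_examples="lazy")

demo.launch()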

1 Like

Oh. In that Space, the build log shows:

 --> RUN wget --progress=dot:giga https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda_12.9.0_575.51.03_linux.run -O cuda-install.run 	&& fakeroot sh cuda-install.run --silent --toolkit --override 	&& rm cuda-install.run

...

 --> RUN pip install --no-cache-dir pip -U && 	pip install --no-cache-dir 	datasets 	"huggingface-hub>=0.19" "hf_xet>=1.0.0,<2.0.0" "hf-transfer>=0.1.4" "protobuf<4" "click<8.1" "pydantic~=1.0" torch==2.8.0

So I modified requirements.txt:

#https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.0.8/flash_attn-2.7.4.post1+cu126torch2.7-cp310-cp310-linux_x86_64.whl
https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.3.14/flash_attn-2.6.3+cu129torch2.8-cp310-cp310-linux_x86_64.whl
numpy>=1.23.5,<2
einops

Then it worked for now.

If PyTorch is now effectively pinned to 2.8 because of CUDA Toolkit 12.9, some programs may stop working…
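For what it’s worth, a quick way to confirm which versions a ZeroGPU build actually picked up is to log them at startup. A minimal sketch (the flash_attn import assumes the prebuilt wheel above installed correctly):

import torch

print("torch:", torch.__version__)             # expected 2.8.0 per the build log above
print("built with CUDA:", torch.version.cuda)  # expected 12.9

try:
    import flash_attn
    print("flash_attn:", flash_attn.__version__)
except ImportError:
    print("flash_attn is not installed")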

Oh, good catch! Not sure if the change to accept up to torch 2.8.0 is intentional since the documentation hasn’t been updated. I’ll ask internally.

1 Like

Hello, I’m experiencing the same “runtime error” on Pony Realism, and I don’t really understand how to resolve it. I’ve tried other applications like Image to Video, and after just a few videos, the problem also started to appear. I haven’t been able to generate any images for almost two days. This is starting to cause problems for my current work. Please help me. I need a simple solution that doesn’t require programming.

1 Like