Usage quota exceeded

When I try to use a Space, I get an error saying “You have exceeded your GPU quota (59s left vs. 120s requested). Sign-up on Hugging Face to get more quotas or retry in 5:03:25”, but I’m already signed up. It started after I bought PRO today.

It hasn’t worked for six months now, and it comes up often on the forums. I’m aware of two recent cases.
I’m not sure whether it’s a bug or intended behavior.

@CaioXapelaum I am also facing the same issue, how did you resolve it?

This could be due to a delay in updating your quota after the upgrade, or you might have already reached the Pro plan’s GPU limit, which is still capped. Check your usage on the account page to ensure you’re within limits. If everything seems correct, try reducing your task’s resource requirements or logging out and back in to resync your account. If the issue persists, contacting Hugging Face support for assistance might resolve the problem.
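For anyone calling a Space from a script, here is a minimal sketch of how the numbers in that error could be pulled out to decide whether to wait or to request less GPU time. The regular expression is modelled only on the single message quoted in this thread, so the exact wording is an assumption and may differ or change; `parse_quota_error` is a hypothetical helper, not part of any Hugging Face library.

```python
import re
from datetime import timedelta

# Pattern based on the one error message quoted in this thread (an assumption;
# the real wording may vary between versions).
QUOTA_RE = re.compile(
    r"\((?P<left>\d+)s left vs\. (?P<requested>\d+)s requested\)"
    r".*?retry in (?P<retry>\d+:\d{2}:\d{2})",
    re.DOTALL,
)


def parse_quota_error(message: str):
    """Return (seconds_left, seconds_requested, retry_delay), or None if no match."""
    match = QUOTA_RE.search(message)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.group("retry").split(":"))
    delay = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return int(match.group("left")), int(match.group("requested")), delay


msg = (
    "You have exceeded your GPU quota (59s left vs. 120s requested). "
    "Sign-up on Hugging Face to get more quotas or retry in 5:03:25"
)
left, requested, delay = parse_quota_error(msg)
print(left, requested, delay.total_seconds())  # 59 120 18205.0
```

In the quoted message, 59 seconds were left while the Space asked for 120, which is why the request was rejected even though some quota remained.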

I forgot to correct this post. I was wrong all along: it looks like the following sign-in button is all that is needed to mitigate the quota limit in a Zero GPU Space.
I put this on my Space and it now has quota mitigation.
A solution for other people’s Spaces that don’t have this button is unknown at this time.
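A rough sketch of what adding such a button might look like, assuming the sign-in button meant here is Gradio's `gr.LoginButton` and that OAuth is enabled for the Space (`hf_oauth: true` in the README metadata). The `greet` function and the layout are hypothetical placeholders, not the actual Space code.

```python
def build_demo():
    # Imported lazily so this file can be loaded without Gradio installed.
    import gradio as gr

    def greet(name: str) -> str:
        # Hypothetical placeholder for the real GPU task.
        return f"Hello, {name}!"

    with gr.Blocks() as demo:
        gr.LoginButton()  # sign-in button; assumed to be the one meant above
        name = gr.Textbox(label="Name")
        output = gr.Textbox(label="Output")
        gr.Button("Run").click(greet, inputs=name, outputs=output)
    return demo
```

On a Space, `build_demo().launch()` at the end of app.py would start the app.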

How do I check this usage? Please guide me.

I got it to work!!!

  • I duplicated black-forest-labs/FLUX.1-dev. (Important: go to its Model Card page and get access granted to the gated model first. Then you can duplicate it.)
  • Set the environment variables:
    HF_TOKEN = your Hugging Face token (I used one with read and write permissions)
    ZEROGPU_V2=true
    ZERO_GPU_PATCH_TORCH_DEVICE=1

Then in your own (duplicated) space:

  • Navigate to the Files
  • Click on app.py
  • Change
@spaces.GPU(duration=75)  # The duration max value can be 120 but this wasn't enough and still didn't work for me

to

@spaces.GPU()

Make sure that any Python code using the gradio_client library either has HF_TOKEN set in the environment or passes the hf_token parameter when creating the client.
Example:

import os
from gradio_client import Client

client = Client("your_duplicated_space/FLUX.1-dev", hf_token=os.getenv("HF_TOKEN"))
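A slightly more defensive variant of the client call above, as a sketch: it fails early with a clear message when no token is available instead of silently passing `None`. `resolve_hf_token` and `make_client` are hypothetical helper names, and the `hf_token` parameter name follows the post above; it may differ between gradio_client versions.

```python
import os


def resolve_hf_token(explicit_token=None):
    """Prefer an explicitly passed token, else fall back to the HF_TOKEN env var."""
    token = explicit_token or os.getenv("HF_TOKEN")
    if not token:
        raise RuntimeError("No token found: set HF_TOKEN or pass a token explicitly.")
    return token


def make_client(space_id, explicit_token=None):
    # Imported lazily so resolve_hf_token can be used without gradio_client installed.
    from gradio_client import Client

    # hf_token is the parameter name used in this thread (assumed to match
    # the installed gradio_client version).
    return Client(space_id, hf_token=resolve_hf_token(explicit_token))
```

Usage would then be `client = make_client("your_duplicated_space/FLUX.1-dev")`.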

Basically, anywhere the decorator @spaces.GPU is set, you are being limited by the Space’s owner. You can duplicate/clone the Space and make your changes in your own copy.
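To make the decorator change concrete, here is a minimal sketch of how it might sit in an app.py. It assumes the `spaces` package is importable on the Space; the no-op fallback and the `generate` function are hypothetical, added only so the file also runs on a machine without `spaces`.

```python
try:
    import spaces  # provided on Hugging Face Zero GPU Spaces

    gpu = spaces.GPU
except ImportError:
    # Hypothetical local fallback: a no-op stand-in that supports both
    # @gpu and @gpu(duration=...) so the file still runs off-Spaces.
    def gpu(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func


@gpu()  # was: @spaces.GPU(duration=75) in the original app.py
def generate(prompt: str) -> str:
    # Placeholder body; the real Space runs the model here.
    return f"would generate an image for: {prompt}"


print(generate("a cat"))
```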

Using the HF_TOKEN like this: client = Client("your_duplicated_space/FLUX.1-dev", hf_token=os.getenv("HF_TOKEN")) allows us to use the Space whether it is ours or someone else’s,

BUT IT DOESN’T use our PRO quota; it treats us as a normal user without PRO limits. It seems that nobody has been able to help us with this issue :frowning:

Related issue?

Hi, @John6666, thank you for the solution!

Did you try sending requests through the API? Does the button help in that case?

I’m really waiting for your answer, because I am stuck calling a private Space through the API and hitting the exceeded GPU quota error.

As you have written in the discussion below…:sweat_smile:
The discussion below is the most recent. It’s not so much a solution as a struggle to figure out who to ask and what to ask.
I don’t know which is the cause: Gradio, the HF Private function, or the Zero GPU Space function… it could be more than one.

Edit:
I called hysts for now…