Usage quota exceeded

I got it to work!!!

  • I duplicated the black-forest-labs/FLUX.1-dev Space. (Important: go to the model card page first and request access to the gated model. Once access is granted, you can duplicate the Space.)
  • Set the environment variables:
    HF_TOKEN=<your HF token> (I used a token with read and write permissions for this)
    ZEROGPU_V2=true
    ZERO_GPU_PATCH_TORCH_DEVICE=1
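
If you want to double-check that the variables are actually visible to your code, something like the sketch below might help. The variable names come from the list above; the helper missing_space_vars is hypothetical and not part of any Hugging Face library.

```python
import os

# Variable names taken from the list above; the helper itself is hypothetical.
REQUIRED_VARS = ("HF_TOKEN", "ZEROGPU_V2", "ZERO_GPU_PATCH_TORCH_DEVICE")


def missing_space_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_space_vars()
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        print("All expected variables are set.")
```

Passing a plain dict instead of os.environ makes it easy to try the check without touching your real environment.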

Then in your own (duplicated) space:

  • Go to the Files tab
  • Click on app.py
  • Change
@spaces.GPU(duration=75)  # The maximum duration is 120, but even that wasn't enough and it still didn't work for me

to

@spaces.GPU()
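
If you prefer to make that change on a local copy of app.py rather than in the web editor, a rough sketch could look like this. It only assumes the decorator is written with parentheses, as in the line above; the function name strip_gpu_duration is made up for illustration.

```python
import re

# Matches "@spaces.GPU" followed by parenthesised arguments, e.g. (duration=75).
# A bare "@spaces.GPU" without parentheses is left untouched.
GPU_DECORATOR = re.compile(r"@spaces\.GPU\([^)]*\)")


def strip_gpu_duration(source: str) -> str:
    """Replace every @spaces.GPU(...) decorator in the source with @spaces.GPU()."""
    return GPU_DECORATOR.sub("@spaces.GPU()", source)


if __name__ == "__main__":
    sample = "@spaces.GPU(duration=75)\ndef infer(prompt):\n    ...\n"
    print(strip_gpu_duration(sample))
```

To apply it to a real file, you would read app.py into a string, pass it through strip_gpu_duration, and write the result back.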

In Python code that uses the gradio_client library, make sure HF_TOKEN is set in the environment, or pass the hf_token parameter when creating the client.
Example:

import os
from gradio_client import Client

client = Client("your_duplicated_space/FLUX.1-dev", hf_token=os.getenv("HF_TOKEN"))
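
A slightly more defensive variant might fail early when no token is available, instead of silently passing None to the client. This is only a sketch: resolve_hf_token and make_client are hypothetical helpers, and the hf_token keyword follows the example above, so check it against the gradio_client version you have installed.

```python
import os


def resolve_hf_token(explicit=None):
    """Return the explicit token if given, else HF_TOKEN from the environment."""
    token = explicit or os.getenv("HF_TOKEN")
    if not token:
        raise RuntimeError("No Hugging Face token: set HF_TOKEN or pass one explicitly.")
    return token


def make_client(space_id, token=None):
    """Create a client for the given Space (assumes gradio_client is installed)."""
    # Imported lazily so the token check works even without gradio_client.
    from gradio_client import Client

    # hf_token is the keyword used in the example above (assumption for your version).
    return Client(space_id, hf_token=resolve_hf_token(token))
```

You would then call make_client("your_duplicated_space/FLUX.1-dev") and get a clear error right away if the token is missing.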

Basically, anywhere the @spaces.GPU decorator is set, you are limited by the Space owner's settings. You can duplicate/clone the Space and make your changes in your own copy.
