Rate Limit When Using Gradio and Inference API

I've run into a rate limit problem a few times now, always while teaching (or preparing to teach). It happens when I use Gradio with the Inference API without preloading anything via requirements.txt. As far as I can tell it only occurs under those conditions, usually after I've loaded a large model for the first or second time, and it also happens while I'm building examples.

Is there a way to supply my token in code so this doesn't occur? I thought Pro was supposed to raise that limit, but maybe I'm missing the code needed to pass the token, so I keep getting the error.

If I can fix it, great. If I can't, it cuts into my best Gradio examples, which have the benefit of being simpler: you can skip requirements.txt entirely because Gradio alone can do it all in app.py, which is a super nice feature for teaching.

Any help you can offer would be great. We have trained over 300 people now internally and gradio is half the show :wink: Keep up the amazing work. Thanks so much!

–Aaron

The Space and code sample below show the issue. The error appears at runtime, but it starts when I load another Gradio-only model, which I assume is exceeding a size limit during the build…

import gradio as gr

context = "This could be any large text corpus to use as subject matter to ask questions about. You can load it as well from a text file to isolate it from code changes, like in the next line"

with open('Context.txt', 'r') as file:
    context = file.read()

question = "What should be documented in a care plan?"

gr.Interface.load(
    "huggingface/deepset/roberta-base-squad2",
    theme="default",
    css=".footer{display:none !important}",
    inputs=[gr.inputs.Textbox(lines=12, default=context, label="Context paragraph"), gr.inputs.Textbox(lines=3, default=question, label="Question")],
    outputs=[gr.outputs.Textbox(label="Answer"), gr.outputs.Textbox(label="Score")],
    title=None,
    description="Provide your own paragraph and ask any question about the text. How well does the model answer?").launch()

Hi @awacke1 !

Can you try passing your token to the api_key parameter of gr.Interface.load? Docs here
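
Something along these lines should work (a minimal sketch reusing the gr.Interface.load call from your snippet; HF_TOKEN is just whatever secret or environment variable name you choose):

import os
import gradio as gr

# Minimal sketch: read the token from an env var / Space secret and pass it through.
gr.Interface.load(
    "huggingface/deepset/roberta-base-squad2",
    api_key=os.environ.get("HF_TOKEN"),
).launch()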

Hopefully this fixes your problem. Thank you for spreading the word about gradio! Let us know how else we can help.


Thanks Freddy! You’re the best :slight_smile:

Here is what I did - let me know if this violates any best practices…

First, I created a new User Access Token (under Profile → User Access Tokens), set it to Write, and named it HF_TOKEN. This worked for me before to deploy from an external GitHub repo to HF on commit using GitHub Actions, which I thought was a great feature…

Next, in my Org's classroom Space, I added a new secret called "HF_TOKEN" and pasted in the token value.

Last, I modified the code as below, assuming the secret value will come from the Org Space's secrets.

import os
import gradio as gr

API_KEY = os.environ.get("HF_TOKEN")

gr.Interface.load(
    "huggingface/deepset/roberta-base-squad2",
    api_key=API_KEY,
    theme="default",
    css=".footer{display:none !important}",
    inputs=[gr.inputs.Textbox(lines=12, default=context, label="Context paragraph"), gr.inputs.Textbox(lines=3, default=question, label="Question")],
    outputs=[gr.outputs.Textbox(label="Answer"), gr.outputs.Textbox(label="Score")],
    title=None,
    description="Provide your own paragraph and ask any question about the text. How well does the model answer?").launch()

Is that the correct method? I noticed this Spaces example uses a similar approach to modify datasets, and I'm curious whether write access from a running Space works the same way, since that is an obscure but cool feature: Persistent Data - a Hugging Face Space by elonmuskceo
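
I'm guessing the dataset write from a running Space looks roughly like this sketch (assuming the huggingface_hub client; the dataset repo name and file names are made up for illustration):

import os
from huggingface_hub import upload_file

# Sketch only: "awacke1/classroom-data" is a hypothetical dataset repo.
upload_file(
    path_or_fileobj="answers.csv",      # local file the Space writes at runtime
    path_in_repo="answers.csv",         # destination path inside the dataset repo
    repo_id="awacke1/classroom-data",
    repo_type="dataset",
    token=os.environ.get("HF_TOKEN"),   # same secret pattern as above
)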

Last, with the classroom org: if I keep the secret there, will it be hidden from my classroom invitees, and will I then be able to see usage on the dashboard? For example, I discovered this:
https://api-inference.huggingface.co/dashboard/usage/2022/8

and found I can also set the dashboard to view usage for my Orgs. Is that true? I would expect those amounts to start going up. Is it actually using my Access Token called "HF_TOKEN", or is it using the copied token value I added as a secret inside my Org Space?

Thanks again! The CI/CD workflow demo using GitHub Actions to replicate changes from open source repos on external GitHub will be shown later this month to our open source community, which has about 890 people. I think that capability is the easiest, fastest way to set up a one-click auto-deploy. Compared with Jenkins, it's a dream!

Hi @awacke1 !

Yep, that's how I would do it too! The one thing I would change about your setup is to use a read-access token instead of a write-access token, since you don't actually need to write to any assets in your Space.

To answer your questions:

  1. Members of your classroom org would not be able to access the secret value. If they have write access in your org, they could delete the secret, but if they are your students I think you can trust them not to troll the class like that lol

  2. About the usage dashboard, I'm actually not 100% sure, but I think if you use a personal access token, the usage would be tallied under your account. If you use an organization token, it would be tallied under the organization.

  3. The space is using the copied token you added as a secret to the space. Spaces don’t have access to your api tokens, only the secrets you add to the space.
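
If you want to sanity-check that at runtime, a small sketch like this (the error message is just illustrative) will fail fast when the secret isn't set:

import os

API_KEY = os.environ.get("HF_TOKEN")
if API_KEY is None:
    # The secret was not added to this Space, or it has a different name.
    raise RuntimeError("HF_TOKEN secret is not set for this Space")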

Let us know if you have other questions!
