What should I do when I hit the rate limit?

{'error': [{'message': 'update vector: failed with status: 429 error: Rate limit reached. You reached PRO hourly usage limit. Use Inference Endpoints (dedicated) to scale your endpoint.'}]}

What should I do when the rate limit is reached?

Did you find a solution?

I solved this problem a while ago. Thanks!

Hi @Culture-and-Morality, thanks for posting! The free Inference API is a solution for easily exploring and evaluating models, while Inference Endpoints is our paid inference solution for production use cases. For larger volumes of requests, or if you need guaranteed latency/performance, we recommend using Inference Endpoints instead to deploy your models on dedicated, fully-managed infrastructure. Inference Endpoints give you the flexibility to quickly create endpoints on CPU or GPU resources, and are billed by compute uptime rather than character usage. Further pricing information can be found here.
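In the meantime, a simple retry loop with backoff can soften occasional 429s on the free API. This is only a minimal sketch: the model URL, the `HF_TOKEN` environment variable, and the retry counts below are placeholders, and for a dedicated Inference Endpoint you would point `API_URL` at the endpoint URL shown in its dashboard instead.

```python
import os
import time
import requests

# Placeholder model URL; for a dedicated Inference Endpoint, use its own URL instead.
API_URL = "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2"
# Assumes your Hugging Face access token is exported as HF_TOKEN.
HF_TOKEN = os.environ["HF_TOKEN"]

def query_with_backoff(payload, max_retries=5):
    """POST to the Inference API, retrying with exponential backoff on HTTP 429."""
    headers = {"Authorization": f"Bearer {HF_TOKEN}"}
    for attempt in range(max_retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 429:
            # Rate limit reached: wait 2, 4, 8, ... seconds, then retry.
            time.sleep(2 ** (attempt + 1))
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Still rate limited after retries; consider a dedicated Inference Endpoint.")

result = query_with_backoff({"inputs": "What should I do when I hit the rate limit?"})
print(result)
```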

Please let me know if you have additional questions. :hugs: