Does the token allowance refresh on a monthly basis?

Azuremis · January 15, 2023, 5:44pm

I’ve recently started using the inference API and my dashboard shows that I’ve nearly used up my allowance.

[Image: Screenshot 2023-01-15 at 17.42.42]

Does this allowance refresh on a monthly basis or will I need to purchase a plan in order to keep using it?

Please clarify.

philschmid · January 16, 2023, 8:32am

michellehbn · January 16, 2023, 11:16am

Hi @Azuremis! Thanks for reaching out, and happy new year!

For larger volumes of requests, or if you need guaranteed latency and performance, you can use our new solution, Inference Endpoints, to easily deploy your models on dedicated, fully managed infrastructure. Inference Endpoints gives you the flexibility to quickly create endpoints on CPU or GPU resources, and is billed by compute uptime rather than by character usage. Further pricing information can be found here.

Our PRO subscription gives you higher Inference API rate limits than the free Inference API plan, and the limit allowance is refreshed monthly. Please let us know if you have any other questions. Thanks again!

Azuremis · January 16, 2023, 4:13pm

Thank you @philschmid for raising this, and @michellehbn for clarifying how the inference endpoint refresh, rates, and performance work. My query has been perfectly answered.
