Performance of the hosted Inference API

For any one fine-tuned GPT-2 model, I would like to understand the performance of Hugging Face's hosted Inference API in 'Lab' mode and 'Startup' mode: specifically, the response time per request and the number of concurrent requests that can be served.
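
In case it helps frame the question, below is a minimal benchmark sketch for measuring both quantities empirically. It assumes the standard Inference API endpoint (`https://api-inference.huggingface.co/models/<model-id>`); the model id and token are placeholders, and the concurrency levels are just illustrative values, not plan limits:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

# Placeholders: substitute your own fine-tuned model id and API token.
API_URL = "https://api-inference.huggingface.co/models/your-username/your-gpt2-model"
HEADERS = {"Authorization": "Bearer hf_xxx"}


def timed_request(prompt: str) -> float:
    """Send one inference request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    response = requests.post(API_URL, headers=HEADERS, json={"inputs": prompt}, timeout=60)
    response.raise_for_status()
    return time.perf_counter() - start


def benchmark(concurrency: int, total_requests: int) -> None:
    """Fire identical prompts with `concurrency` workers and report latency stats."""
    prompts = ["Once upon a time"] * total_requests
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_request, prompts))
    print(
        f"concurrency={concurrency}: "
        f"min={latencies[0]:.2f}s "
        f"median={latencies[len(latencies) // 2]:.2f}s "
        f"max={latencies[-1]:.2f}s"
    )


if __name__ == "__main__":
    for concurrency in (1, 2, 4, 8):
        benchmark(concurrency, total_requests=16)
```

Running this under each plan would show how median latency degrades as concurrency grows, but I would still like to know the documented or expected limits for the two modes rather than relying only on my own measurements.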