So now I need a dedicated endpoint to test most models? (only 34k out of 1.6 million supported)

I am very confused by the serverless Inference API, as it seems to have been thoroughly revamped recently. Previously, it was possible to spin up ANY model (perhaps waiting a bit longer if it was not "warm") and then run inference on it…
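For context, this is roughly what that old workflow looked like on my side (a minimal sketch; `gpt2` and the token are just placeholders):

```python
import requests

# Old serverless Inference API: POST to the model's URL and let it load on demand.
API_URL = "https://api-inference.huggingface.co/models/gpt2"   # any model ID used to work
headers = {"Authorization": "Bearer hf_xxx"}                   # your HF access token

payload = {
    "inputs": "The answer to life, the universe and everything is",
    "options": {"wait_for_model": True},  # block until the model is loaded ("warm")
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```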

Now, most models are not supported by any inference provider (only 34k out of 1.6 million models are), which means the only way to test an unsupported model is to deploy it locally (not feasible in many cases) or to run it on a dedicated endpoint. In practice, that means creating a new dedicated endpoint sized for the model, testing a few prompts as quickly as possible, destroying the endpoint to avoid further charges, and then rinsing and repeating for every other model, as in the sketch below.
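Here is a minimal sketch of that create/test/destroy loop with `huggingface_hub`, assuming the model is a text-generation model; the endpoint name, repository, instance size/type, and region are placeholders I would have to adjust per model:

```python
from huggingface_hub import create_inference_endpoint

# Spin up a short-lived dedicated endpoint, run a few test prompts, then tear it down.
endpoint = create_inference_endpoint(
    "scratch-test-endpoint",           # hypothetical endpoint name
    repository="some-org/some-model",  # hypothetical model to test
    framework="pytorch",
    task="text-generation",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    instance_size="x1",                # assumption: pick a size that fits the model
    instance_type="nvidia-a10g",       # assumption: pick hardware that fits the model
    type="protected",
)

try:
    endpoint.wait()  # block until the endpoint is running (billing starts here)
    for prompt in ["Hello, who are you?", "Summarize the theory of relativity."]:
        print(endpoint.client.text_generation(prompt, max_new_tokens=64))
finally:
    endpoint.delete()  # destroy the endpoint immediately to avoid further charges
```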

Am I missing something here?
