Difference between pinned models and Inference Endpoints

stevhliu · November 14, 2022, 4:05pm

Hi @popaqy, here is a very high-level overview:

Pinned models (models kept preloaded for inference) are available through the Inference API, but pinning is only supported for existing paying customers. Apart from that, the Inference API is a free product.
Inference Endpoints is like the next iteration of pinned models: it builds and deploys your model on its own secure Endpoint, with autoscaling and security features. You can also choose your own CPU/GPU depending on your needs to keep costs low. Consider Inference Endpoints if you need a production-ready environment for your model!
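
If it helps to see the difference from the client side, here is a rough sketch of how the two requests might be built. It assumes the hosted Inference API URL pattern in use around the time of this thread (`https://api-inference.huggingface.co/models/<model_id>`) and a bearer token; the helper names, the token and the Endpoint URL are hypothetical placeholders, and the sketch only constructs the requests without sending them.

```python
import json
import urllib.request

# Assumption: shared Inference API base URL as documented around the time of this thread.
INFERENCE_API_BASE = "https://api-inference.huggingface.co/models"


def build_request(url, token, inputs):
    """Build (but do not send) a JSON POST request carrying a bearer token."""
    payload = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def inference_api_request(model_id, token, inputs):
    """Request against the shared Inference API, addressed by model id."""
    return build_request(f"{INFERENCE_API_BASE}/{model_id}", token, inputs)


def endpoint_request(endpoint_url, token, inputs):
    """Request against a dedicated Inference Endpoint, addressed by its own URL."""
    return build_request(endpoint_url, token, inputs)


if __name__ == "__main__":
    # "hf_xxx" and the Endpoint URL below are placeholders, not real values.
    api_req = inference_api_request("gpt2", "hf_xxx", "Hello")
    ep_req = endpoint_request("https://my-endpoint.example.com", "hf_xxx", "Hello")
    print(api_req.full_url)
    print(ep_req.full_url)
    # To actually send a request: urllib.request.urlopen(api_req)
```

The point of the sketch is that the payload and auth header look the same in both cases. What changes is where the model runs: the Inference API is a shared service addressed by model id, while an Endpoint is your own deployment with its own URL and the hardware you picked.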
