Difference between pinned models and Inference Endpoints

Hi @popaqy, here is a very high-level overview:

  1. Pinned models (models kept preloaded so they are ready for inference) are offered through the Inference API, but pinning is only supported for, and available to, existing paying customers. Otherwise, the Inference API is a free product 🙂
  2. Inference Endpoints is like the next iteration of pinned models: it builds and deploys your model on its own secure endpoint, with autoscaling and security features. You can also choose your own CPU/GPU depending on your needs, which helps keep costs low. Check this out if you need a production-ready environment for your model!
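
In case it helps to see the difference from the client side, here is a rough sketch of how a request to each might be built. It only constructs the requests and does not send them. Several things here are assumptions on my part rather than anything stated above: the shared Inference API base URL (`https://api-inference.huggingface.co/models`, the address historically used, so check the current docs), the `{"inputs": ...}` payload shape, the placeholder token, and the endpoint URL, which is purely hypothetical since each Inference Endpoint gets its own URL when you create it. `build_request` and `inference_api_url` are helper names I made up for illustration.

```python
import json
import urllib.request

# Assumption: historical base URL of the shared Inference API; verify in the docs.
INFERENCE_API_BASE = "https://api-inference.huggingface.co/models"


def inference_api_url(model_id: str) -> str:
    """Shared Inference API: the model is selected by its ID in the URL path."""
    return f"{INFERENCE_API_BASE}/{model_id}"


def build_request(url: str, token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request authenticated with a bearer token."""
    body = json.dumps({"inputs": text}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    token = "hf_xxx"  # placeholder, not a real token

    # Shared infrastructure: same host for every model.
    shared = build_request(inference_api_url("distilbert-base-uncased"), token, "Hello")

    # Dedicated Inference Endpoint: hypothetical URL, yours will differ.
    dedicated = build_request("https://my-endpoint.example.com", token, "Hello")

    print(shared.get_method(), shared.full_url)
    print(dedicated.get_method(), dedicated.full_url)
    # To actually send one of them: urllib.request.urlopen(shared)
```

The point of the sketch is that the calling code looks almost the same in both cases; what changes is where the request goes (shared infrastructure versus a deployment dedicated to your model).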