Anyone else VERY confused?

Just upgraded to ‘PRO’…

I thought that would mean I could get APIs/endpoints to connect my projects to different models…

But… Nope.

Still have to purchase hosted GPU Inference Endpoints at a minimum of $0.60/hr…

Is that correct? How are we supposed to be using this platform? My head is spinning.

hi @squatchydev9000 ,

Apologies for the confusion. As a Pro user you get access to Inference for a special list of large LLMs (read more here), as well as higher rate limits for thousands of compatible models on the Hub (see all tasks here).

For custom GPU hardware and Inference Endpoints, follow the pricing here and here.

Hugging Face PRO users now have access to exclusive API endpoints for a curated list of powerful models that benefit from ultra-fast inference powered by text-generation-inference. This is a benefit on top of the free inference API, which is available to all Hugging Face users to facilitate testing and prototyping on 200,000+ models. PRO users enjoy higher rate limits on these models, as well as exclusive access to some of the best models available today.

Here is the list of large language models served with TGI:

| Model ID | Task |
|---|---|
| google/flan-t5-xxl | text2text-generation |
| google/flan-ul2 | text2text-generation |
| bigcode/starcoder | text-generation |
| codellama/CodeLlama-13b-hf | text-generation |
| codellama/CodeLlama-34b-Instruct-hf | text-generation |
| HuggingFaceH4/starchat-beta | text-generation |
| HuggingFaceH4/zephyr-7b-alpha | text-generation |
| HuggingFaceH4/zephyr-7b-beta | text-generation |
| HuggingFaceM4/idefics-80b-instruct | text-generation |
| meta-llama/Llama-2-70b-chat-hf | text-generation |
| mistralai/Mistral-7B-Instruct-v0.1 | text-generation |
| mistralai/Mistral-7B-Instruct-v0.2 | text-generation |
| mistralai/Mistral-7B-v0.1 | text-generation |
| openchat/openchat_3.5 | text-generation |
| tiiuae/falcon-180B-chat | text-generation |
| bigscience/bloom | text-generation |
| tiiuae/falcon-7b-instruct | text-generation |
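As a minimal sketch of how you could call one of the models above through the Inference API: the snippet below builds the HTTP request with only the standard library, using the usual `api-inference.huggingface.co/models/<model-id>` URL pattern and a token read from the `HF_TOKEN` environment variable (both are assumptions here; check the Inference API docs for the current endpoint and auth details).

```python
# Sketch: query a PRO-listed model via the HF Inference API (stdlib only).
# Assumes the api-inference.huggingface.co endpoint and an HF_TOKEN env var.
import json
import os
import urllib.request

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.1"
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"


def build_request(prompt: str, token: str) -> urllib.request.Request:
    """Construct the POST request without sending it."""
    payload = json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": 100},
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    token = os.environ.get("HF_TOKEN", "")
    req = build_request("Explain TGI in one sentence.", token)
    if token:  # only hit the network when a token is configured
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read()))
```

The same call can be made more conveniently with the `huggingface_hub` client library, but the raw-HTTP form makes the rate-limited endpoint behavior easy to inspect.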