For meta-llama and mistral text generation LLM using the InferenceClient(), I’m getting: Bad Request: The endpoint is paused, ask a maintainer to restart it
Is something not working at HF?
Is it explicitly PAUSED? Maybe it’s under maintenance or something… @michellehbn @meganariley
Hi @gtvracer, thanks for reaching out and for being PRO! On the model page, you can request provider support for that model if it is not currently deployed and available for use through Inference Providers.
As an example, meta-llama/Llama-3.1-8B-Instruct is currently available through providers like Featherless AI, Nscale, SambaNova, Fireworks, Hyperbolic, etc. You can also deploy models with Inference Endpoints (dedicated).
To see which models are available to use with HF Inference, check out our filtered search here: Models - Hugging Face
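If you prefer to check this from code rather than the web UI, here is a minimal sketch that asks the public Hub API which providers currently serve a given model. The expand[] query parameter and the exact response shape are assumptions based on what the model pages display, so treat it as illustrative rather than a documented contract:

```python
import requests

# Hypothetical availability check: ask the Hub API which Inference
# Providers currently serve a model. The expand[] parameter and the
# response shape are assumptions, not a documented contract.
MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

resp = requests.get(
    f"https://huggingface.co/api/models/{MODEL_ID}",
    params={"expand[]": "inferenceProviderMapping"},
    timeout=10,
)
resp.raise_for_status()
mapping = resp.json().get("inferenceProviderMapping") or {}

if mapping:
    print(f"{MODEL_ID} is served by: {', '.join(mapping)}")
else:
    print(f"No Inference Provider is serving {MODEL_ID} right now.")
```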
Ok. Text generation models are no longer available through HF Inference API: Models - Hugging Face
Is this intended?
This would be a massive bummer haha
Exception: 504 Server Error: Gateway Time-out for url: https://api-inference.huggingface.co/models/meta-llama/Llama-3.3-70B-Instruct/v1/chat/completions
Currently, it appears that no text generation models are deployed on HF Inference, though some are available through other Inference Providers…
How can HF suddenly withdraw support for InferenceClient calls that worked yesterday, without any warning? Very unprofessional and very disappointing…
Make sure your huggingface-hub is updated to 0.33.2 so it accepts the provider parameter. I used "nebius" for meta-llama models and it worked.
Use provider="together" for mistralai models.
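Putting those two posts together, here is a minimal sketch of the fix. The provider names and model ID are just the examples from this thread, and the api_key value is a placeholder for your own HF token:

```python
from huggingface_hub import InferenceClient

# Route requests through an Inference Provider instead of the retired
# HF Inference text-generation deployment. Requires huggingface_hub >= 0.33.2.
client = InferenceClient(
    provider="nebius",   # "together" reportedly works for mistralai models
    api_key="hf_...",    # placeholder: use your own Hugging Face token
)

response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

With provider set, the client sends the request to that provider's endpoint instead of api-inference.huggingface.co, which is why the paused-endpoint and 504 errors above go away.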
Almost-official news from HF DevRel on Discord:
yes! these models were sunset as part of us closing down hugging.chat unfortunately
however there’s quite a lot of models that you can use through our inference providers as a replacement