We're invoking
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3" with our access token.
We're getting:
{
  "detail": "Model error: {\"error\":\"Bad Request: Your endpoint is in error, check its status on endpoints.huggingface.co\"}"
}
This was working fine until yesterday. Please advise.
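For context, here is roughly how we're calling it (a minimal sketch; HF_TOKEN is a placeholder for our access token):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3"
HF_TOKEN = "hf_xxx"  # placeholder for our access token

headers = {"Authorization": f"Bearer {HF_TOKEN}"}
payload = {"inputs": "Hello, how are you?"}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.status_code)
print(response.json())  # currently returns the "Model error: ... Bad Request" detail above
```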
It seems that errors may occur due to changes in the base URL for API requests, models that have not been deployed, or other factors such as the influence of headers…
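If a changed base URL is the culprit, one way to sidestep hard-coded endpoints is to go through huggingface_hub, which resolves the current serverless URL for the model itself. A minimal sketch, assuming a valid access token:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # placeholder token

# The client resolves the current endpoint for the model, so a change
# in the underlying API base URL does not break the call.
out = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=64,
)
print(out.choices[0].message.content)
```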
Whenever I try to generate a response from the API, it shows payment error 402 and this link: "https://huggingface.co/api/inference-proxy/hf-inference/models/Qwen/QwQ-32B/v1/chat/completions". After tapping on the link, it shows "Sorry, we can't find the page you are looking for." It also shows this promise error in the console: "You have exceeded your monthly included credits for Inference Providers. Subscribe to PRO to get 20x more monthly allowance." I haven't used it at all. I created a new account for t…
Dear Hugging Face Support Team,
I hope you are doing well.
We are integrating the Dify knowledge-base tool with the Hugging Face Inference API to generate embeddings (feature-extraction pipeline). We now consistently receive the following 404 error:
404 Client Error: Not Found for url: https://api-inference.huggingface.co/pipeline/feature-extraction/intfloat/multilingual-e5-large-instruct
(Request ID: Root=1-683b4db9-1e980b4c755dcffd2fc32730; bdc8fc2e-733e-4d9b-afee-955f5c940bdb)
Until about…
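A minimal sketch of the equivalent embedding call through huggingface_hub, which avoids hard-coding the old /pipeline/feature-extraction URL form (the token is a placeholder):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # placeholder token

# feature_extraction returns the embedding as a numpy array
embedding = client.feature_extraction(
    "query: example sentence to embed",
    model="intfloat/multilingual-e5-large-instruct",
)
print(embedding.shape)
```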
We're encountering a 404 Not Found error from the Hugging Face Inference endpoint when the request includes the X-Forwarded-Host header.
The issue appears to stem from the presence of the header itself, regardless of which private or public domain we set it to:
X-Forwarded-Host: google.com
Without Header – Works
- When this header is removed, the request succeeds.
- Identical payloads and endpoints return valid responses when the header is omitted.

With Header – Fails
- If included (even with a valid public domain), t…
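A minimal sketch of the two cases being compared (the endpoint URL and token are placeholders):

```python
import requests

ENDPOINT_URL = "https://example.endpoints.huggingface.cloud"  # placeholder endpoint
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder token
payload = {"inputs": "test"}

# Without X-Forwarded-Host: the request succeeds.
ok = requests.post(ENDPOINT_URL, headers=headers, json=payload)
print(ok.status_code)

# With X-Forwarded-Host: the same request returns 404,
# regardless of the domain set in the header.
bad = requests.post(
    ENDPOINT_URL,
    headers={**headers, "X-Forwarded-Host": "google.com"},
    json=payload,
)
print(bad.status_code)
```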
Also, there may have been an outage on the site around the time these questions were asked.
I'm trying to get an existing app (OpenAI and Gemini both work well) to run on open-weight models and keep failing. I've now distilled a minimal example that works on gpt-4.1-mini but not on Qwen3.
import openai

client = openai.Client()
MODEL = "gpt-4.1-mini"
messages = [
    {"role": "user", "content": "You are a shopping assistant for a store. You can help pick the right products for the user."},
    {"role": "user", "content": "I'm looking for a T-shirt"},
]
dummy_tools = [{
    "type": "funct…