Persistent 404 Not Found Errors with Public Inference API

Problem:
For the past day or so, all my attempts to make POST requests to the public Inference API endpoint have resulted in a 404 Not Found error. This happens regardless of the model I try to query, including standard, known-available models like gpt2. The response body simply contains "Not Found".

My Hugging Face Username: Mehdimemar

Troubleshooting Steps Taken:

  1. Model Validity Confirmed: I’ve tested numerous valid model IDs (like gpt2, distilbert-base-uncased-finetuned-sst-2-english, and various segmentation models). The 404 error occurs consistently.

  2. Access Token Verified: I have generated multiple new User Access Tokens from my account settings with the read role. I’ve carefully copied them and ensured they are correctly formatted in the Authorization: Bearer YOUR_HF_TOKEN header. I tried write tokens as well, with the same result.

  3. Network Connectivity Verified: nslookup, ping, and tracert to api-inference.huggingface.co are all successful from my testing environment. General internet connectivity is working fine (tested against httpbin.org).

  4. Direct curl Test (Outside Other Platforms): To isolate the issue, I performed direct tests using curl from my local machine. These tests also result in the same 404 Not Found error. Example command available upon request; a rough Python equivalent is sketched after this list.

  5. Checked Hugging Face Status Page: The status page (status.huggingface.co) indicates services are operational, though HF Inference shows some past instability. The persistent 404 error doesn’t seem like temporary service unavailability (which usually returns a 503).

  6. Checked Account Settings: I’ve reviewed my account settings (Tokens, Billing [though not required for the public API], etc.) via huggingface.co/settings/tokens and haven’t found any obvious issues, restrictions, or required actions. My email is verified.
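For reference, here is a rough Python equivalent of my curl test (using the requests library; YOUR_HF_TOKEN is a placeholder, and the whoami-v2 call is just an extra sanity check that the token itself is valid, separate from inference):

import requests

HF_TOKEN = "YOUR_HF_TOKEN"  # placeholder, not a real token
headers = {"Authorization": f"Bearer {HF_TOKEN}"}

# Sanity check the token itself, independent of inference.
# (whoami-v2 is the endpoint the huggingface_hub library uses for login checks.)
who = requests.get("https://huggingface.co/api/whoami-v2", headers=headers)
print("whoami:", who.status_code)

# Reproduce the failing call against the public serverless endpoint.
resp = requests.post(
    "https://api-inference.huggingface.co/models/gpt2",
    headers=headers,
    json={"inputs": "Hello, world"},
)
print("inference:", resp.status_code, resp.text)  # consistently 404 Not Found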


Conclusion / Question:

Given that network connectivity is fine, valid models are being used, and valid tokens (with the correct permissions) appear to be sent correctly (verified with curl -v), the issue strongly suggests a problem with token validation specific to my account (Mehdimemar), or some unknown restriction or status issue with my account preventing Inference API access.

Has anyone else experienced similar persistent 404 errors recently? Is there anything specific I should double-check, or could this require investigation by the Hugging Face team?

1 Like

I think pretty much all users are in that state…

1 Like

I’m new to Hugging Face, and experienced this issue too…

1 Like

I am also experiencing the same issue, and I have checked with many others too. This issue has been occurring for a few days. I think it’s partly a bug in their system and partly because they are shifting to other inference providers.

1 Like

I am also facing this same issue.

1 Like

Me too. I need to do a huge training task for an application, and I paid for the pro tier, plus already paid a good chunk extra for the first part of the training task. Really frustrating that I’m now stuck, unable to complete the task even though I forked out a bunch. Is there no way to get an official reply? I haven’t seen any way of contacting support, if it exists. The official status page says all systems go.

2 Likes

For libraries, it is best to contact the developer via GitHub.

There are several ways to contact support for general issues with the Hub.

website@huggingface.co

Hi all, thanks for reporting! You can check to see if your model is available to use with the HF Inference API (or any Inference Provider) here: Models - Hugging Face. If it’s not deployed by any Inference Provider, you can request provider support on the model page.

Please note Inference Endpoints is available to use - more info here: Inference Endpoints.
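For example, once a model shows a live provider there, you can query it with huggingface_hub’s InferenceClient (a sketch, assuming a recent huggingface_hub release, since the provider argument is relatively new, and a placeholder token):

from huggingface_hub import InferenceClient

# "auto" selects the first available Inference Provider for the model.
client = InferenceClient(provider="auto", api_key="YOUR_HF_TOKEN")  # placeholder token

completion = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Say hello"}],
    max_tokens=50,
)
print(completion.choices[0].message.content)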

Thanks!

1 Like

Yes, me too. I faced the issue with some models, but the following two models work fine:

  • model: "mistralai/Mixtral-8x7B-Instruct-v0.1"
  • model: "meta-llama/Llama-3.3-70B-Instruct"

Note that I have only checked a few.

1 Like

Hi there! Has this been resolved?

1 Like

As mentioned above, if the model is currently deployed, it should be available via the Inference Provider.

Please note that the program, or rather the Endpoint URL, has changed slightly.
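For instance, a raw HTTP call now goes through the router’s OpenAI-compatible route (a sketch; YOUR_HF_TOKEN is a placeholder, and the model must be deployed by some provider):

import requests

# Sketch of the newer router endpoint (OpenAI-compatible chat completions).
resp = requests.post(
    "https://router.huggingface.co/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_HF_TOKEN"},  # placeholder token
    json={
        "model": "Qwen/Qwen2.5-Coder-32B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 100,
    },
)
print(resp.status_code, resp.json())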

I’m using the "Qwen/Qwen2.5-Coder-32B-Instruct" model and was getting the same/similar error. I modified the initialization a little bit, adding max_tokens, provider, etc., and it worked for me. Here’s how I initialize it:

# Requires the llama-index-llms-huggingface-api package for this class.
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

llm = HuggingFaceInferenceAPI(
    model_name="Qwen/Qwen2.5-Coder-32B-Instruct",
    temperature=0.7,
    max_tokens=100,
    provider="auto",
)
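With that in place, a simple completion call works as usual (llm.complete is the standard llama_index completion method; the prompt here is just an example):

response = llm.complete("Write a haiku about debugging.")
print(response.text)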

In my case, I think the issue was related to max_tokens, because I was almost running out of free credits. That can be really confusing, because the error doesn’t say it’s a payment issue unless you’ve completely used up your free credits.

Sometimes it could also be because you’ve run out of free credits, so take a close look at the error message.

Try assigning a lower max_tokens and see how it works.

1 Like