HF Inference API: 503/504 Server Error

Hello,
I am trying to run inference with the code snippet below, but I always get a 503 or 504 server error. I’ve tried different models and different ways of calling the Inference API, but I always run into the same problem. It also appears to be specific to my HF account, since others have been able to run the exact same code without issues. Does anyone know what could be going wrong?

CODE SNIPPET

from huggingface_hub import InferenceClient


# Initialize Hugging Face InferenceClient
client = InferenceClient(
   model="facebook/opt-1.3b",
   token="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)


result = client.text_generation(
   prompt="Hello you are a chatbot, answer this ",
   model="facebook/opt-1.3b",
)
print(result)

ERROR

HTTPError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
408 try:
→ 409 response.raise_for_status()
410 except HTTPError as e:

... (6 frames hidden) ...
HTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/hf-inference/models/facebook/opt-1.3b

The above exception was the direct cause of the following exception:

HfHubHTTPError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
479 # Convert HTTPError into a HfHubHTTPError to display request information
480 # as well (request id and/or server error message)
→ 481 raise _format(HfHubHTTPError, str(e), response) from e
482
483

HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/hf-inference/models/facebook/opt-1.3b


The API seems to be in a bad state at the moment.

#model_id = "facebook/opt-1.3b" # no response for a long time...
model_id = "HuggingFaceTB/SmolLM2-135M-Instruct" # 503 at first, then works
#model_id = "Qwen/Qwen2.5-3B-Instruct" # 503, then no response for a long time...

HF_TOKEN = "hf_my_pro_token***"

# Initialize Hugging Face InferenceClient
client = InferenceClient(
   model=model_id,
   token=HF_TOKEN,
   provider="hf-inference",
   timeout=600,
)

result = client.text_generation(
   prompt="Hello you are a chatbot, answer this ",
   model=model_id,
)

print(result)
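Since a 503 often just means the model is still loading, retrying with exponential backoff sometimes gets through once the model is warm. A minimal sketch (the helper name and the generic exception handling are my own, not part of huggingface_hub):

```python
import time

def call_with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn() on HTTP 503/504, sleeping 1s, 2s, 4s, ... between tries.

    Assumes the raised exception carries a `.response` with a
    `.status_code`, as huggingface_hub's HfHubHTTPError does.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:  # narrow this to HfHubHTTPError in real code
            status = getattr(getattr(exc, "response", None), "status_code", None)
            if status not in (503, 504) or attempt == max_attempts - 1:
                raise  # non-transient error, or out of attempts
            time.sleep(base_delay * 2 ** attempt)

# Usage (assuming `client` from the snippet above):
# result = call_with_retries(
#     lambda: client.text_generation(prompt="Hello you are a chatbot, answer this ")
# )
```

This won’t help if the backend is genuinely down, but it smooths over cold-start 503s.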

Hello,
I am getting 504 Gateway Timeout errors on the API. I have tried several other endpoints too and get the same error.
@John6666 please take a look

curl --location --request POST 'https://api-inference.huggingface.co/models/facebook/bart-large-mnli' \
--header 'Authorization: Bearer xxxxxxxxxxxx' \
--header 'Content-Type: application/json' \
--data-raw '{
  "inputs": "I love using n8n for workflows!",
  "parameters": {
    "candidate_labels": ["automation", "sports", "finance"]
  }
}'
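For this legacy api-inference endpoint, the request body also accepts an "options" object: setting "wait_for_model": true asks the server to hold the request until the model has loaded instead of returning 503 right away. A sketch of building that body (the helper function is my own, not part of any library):

```python
import json

def build_zero_shot_body(text, labels, wait_for_model=True):
    """Build the JSON body for a zero-shot classification request.

    "options.wait_for_model" tells the Inference API to wait for the
    model to load rather than failing fast with 503.
    """
    return json.dumps({
        "inputs": text,
        "parameters": {"candidate_labels": labels},
        "options": {"wait_for_model": wait_for_model},
    })
```

The result can be passed as the `--data-raw` payload of the same curl command. It won’t fix a gateway timeout, but it rules out plain model cold starts.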


I tried it as well and currently get a 504 timeout for both https://api-inference.huggingface.co/models/facebook/bart-large-mnli and https://router.huggingface.co/hf-inference/models/facebook/bart-large-mnli.
It doesn’t seem to be a transient server glitch. @tomaarsen

Same here; it has been failing for two days on my end. HF Enterprise support has been looking into it since then but has apparently been unable to resolve it.
