HF Inference API: 503/504 Server Error

Hello,
I am trying to run inference with the code snippet below, but I always get a 503 or 504 server error. I’ve tried different models and different ways of calling the Inference API, but I always run into the same problem. It also appears to be specific to my HF account, since others have been able to run the exact same code without issues. Does anyone know what could be going wrong?

CODE SNIPPET

from huggingface_hub import InferenceClient


# Initialize Hugging Face InferenceClient
client = InferenceClient(
   model="facebook/opt-1.3b",
   token="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)


result = client.text_generation(
   prompt="Hello you are a chatbot, answer this ",
   model="facebook/opt-1.3b",
)
print(result)

ERROR

HTTPError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
408 try:
→ 409 response.raise_for_status()
410 except HTTPError as e:

... (6 frames hidden) ...
HTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/hf-inference/models/facebook/opt-1.3b

The above exception was the direct cause of the following exception:

HfHubHTTPError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
479 # Convert HTTPError into a HfHubHTTPError to display request information
480 # as well (request id and/or server error message)
→ 481 raise _format(HfHubHTTPError, str(e), response) from e
482
483

HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/hf-inference/models/facebook/opt-1.3b


The API seems to be in a bad state at the moment.

#model_id = "facebook/opt-1.3b" # no response for a long time...
model_id = "HuggingFaceTB/SmolLM2-135M-Instruct" # 503 at first, then works
#model_id = "Qwen/Qwen2.5-3B-Instruct" # 503, then no response for a long time...

HF_TOKEN = "hf_my_pro_token***"

# Initialize Hugging Face InferenceClient
client = InferenceClient(
   model=model_id,
   token=HF_TOKEN,
   provider="hf-inference",
   timeout=600,
)

result = client.text_generation(
   prompt="Hello you are a chatbot, answer this ",
   model=model_id,
)

print(result)
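Since a 503 often just means the model is still loading, retrying with exponential backoff sometimes gets through once the model is warm. A minimal sketch (the helper name and the generic exception handling are my own, not part of huggingface_hub):

```python
import time

def call_with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn() on HTTP 503/504, sleeping 1s, 2s, 4s, ... between tries.

    Assumes the raised exception carries a `.response` with a
    `.status_code`, as huggingface_hub's HfHubHTTPError does.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:  # narrow this to HfHubHTTPError in real code
            status = getattr(getattr(exc, "response", None), "status_code", None)
            if status not in (503, 504) or attempt == max_attempts - 1:
                raise  # non-transient error, or out of attempts
            time.sleep(base_delay * 2 ** attempt)

# Usage (assuming `client` from the snippet above):
# result = call_with_retries(
#     lambda: client.text_generation(prompt="Hello you are a chatbot, answer this ")
# )
```

This won’t help if the backend is genuinely down, but it smooths over cold-start 503s.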

Hello,
I am getting 504 Gateway Timeout errors on the API. I have tried several other endpoints too and get the same error.
@John6666 please take a look

curl --location --request POST 'https://api-inference.huggingface.co/models/facebook/bart-large-mnli' \
--header 'Authorization: Bearer xxxxxxxxxxxx' \
--header 'Content-Type: application/json' \
--data-raw '{
  "inputs": "I love using n8n for workflows!",
  "parameters": {
    "candidate_labels": ["automation", "sports", "finance"]
  }
}'
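For this legacy api-inference endpoint, the request body also accepts an "options" object: setting "wait_for_model": true asks the server to hold the request until the model has loaded instead of returning 503 right away. A sketch of building that body (the helper function is my own, not part of any library):

```python
import json

def build_zero_shot_body(text, labels, wait_for_model=True):
    """Build the JSON body for a zero-shot classification request.

    "options.wait_for_model" tells the Inference API to wait for the
    model to load rather than failing fast with 503.
    """
    return json.dumps({
        "inputs": text,
        "parameters": {"candidate_labels": labels},
        "options": {"wait_for_model": wait_for_model},
    })
```

The result can be passed as the `--data-raw` payload of the same curl command. It won’t fix a gateway timeout, but it rules out plain model cold starts.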


I tried it as well and currently get a 504 timeout for both https://api-inference.huggingface.co/models/facebook/bart-large-mnli and https://router.huggingface.co/hf-inference/models/facebook/bart-large-mnli.
It doesn’t seem to be a transient server glitch. @tomaarsen

Same here; it has been failing for two days on my end. HF Enterprise support has been looking into it since then but has apparently been unable to resolve it.
