Hugging Face API timeouts on all models!

GeorginaSteele · September 17, 2025, 11:37am

I’m experiencing repeated timeouts on every model I try through the API. I’m currently on the free tier. Is anyone else seeing this? Could the free tier be part of the problem?

status: 504,
statusText: 'Gateway Time-out',

John6666 · September 17, 2025, 12:17pm

Hmm…? It works for me.

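import os
from huggingface_hub import InferenceClient

# Read the token from the environment if set; otherwise fall back to the placeholder.
HF_TOKEN = os.environ.get("HF_TOKEN", "hf_***my_read_token***")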

client = InferenceClient(
    provider="hf-inference",
    api_key=HF_TOKEN,
)

completion = client.chat.completions.create(
    model="HuggingFaceTB/SmolLM3-3B",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)

print(completion.choices[0].message)
"""
ChatCompletionOutputMessage(role='assistant', content="<think>\nOkay, the user is asking for the capital of France. Let me make sure I remember correctly. I think it's Paris. Wait, is there any chance they might be confused with another city? Maybe someone might think it's Lyon or Marseille? No, those are major cities but not the capital. The capital of France is definitely Paris. I should confirm that. Let me think if there's any historical context that could change this. No, Paris has been the capital since the 10th century. There's also the Eiffel Tower and the Louvre, which are famous landmarks there. Yeah, I'm pretty confident it's Paris. I should just state that clearly and maybe add a bit about it being the largest city in France to give more context.\n</think>\n\nThe capital of France is **Paris**. It is not only the largest city in the country but also a major global cultural, economic, and political hub. Paris is renowned for landmarks like the Eiffel Tower, the Louvre Museum, and the Seine River, and has been the heart of French history and politics for centuries.", tool_call_id=None, tool_calls=[], refusal=None, annotations=None, audio=None, function_call=None, reasoning_content=None)
"""

No major server issues have been detected so far.
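
If the 504s turn out to be transient, wrapping the call in a retry with exponential backoff may help ride them out. This is only a minimal sketch and is not tied to any particular client: `GatewayTimeout` and `call_with_retries` are hypothetical names introduced here for illustration, and you would need to check which exception your client version actually raises on a 504 and pass it via `retry_on`.

```python
import random
import time


class GatewayTimeout(Exception):
    """Hypothetical stand-in for whatever error a client raises on HTTP 504."""


def call_with_retries(fn, *, retries=4, base_delay=1.0, max_delay=30.0,
                      retry_on=(GatewayTimeout,), sleep=time.sleep):
    """Call fn(), retrying with exponential backoff when it raises retry_on."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt, capped at max_delay,
            # plus up to 10% random jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the snippet above, the request would go inside a small function or lambda, e.g. `call_with_retries(lambda: client.chat.completions.create(...), retry_on=(YourErrorType,))`, where `YourErrorType` is a placeholder for the real exception. Retries will not help during a multi-hour outage, but they can smooth over brief gateway hiccups.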

meganariley · September 17, 2025, 3:11pm

Hi @GeorginaSteele, are you still experiencing this issue? Can you please share the model you’re using?

joshmaiven · September 18, 2025, 9:31am

Same issue for me, @meganariley.
The main one I’m trying to use is BAAI/bge-large-en-v1.5, and it keeps timing out.

I switched to a DeepSeek model and that also timed out. It’s not all models, though, as BAAI/bge-small-en-v1.5 was working.

Not sure what the issue is; it hasn’t been working for around a day now.
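
To narrow down which models are affected, something like the following could be used to probe each one in turn and record whether it responds and how long it takes. It is a rough sketch: `probe_models` is a hypothetical helper, and `call` is whatever function you supply to make one request against a given model.

```python
import time


def probe_models(models, call, timer=time.perf_counter):
    """Run call(model) for each model; record status and elapsed seconds."""
    results = {}
    for model in models:
        start = timer()
        try:
            call(model)
            status = "ok"
        except Exception as exc:  # broad on purpose: this is only a diagnostic
            status = f"failed: {type(exc).__name__}"
        results[model] = (status, timer() - start)
    return results
```

For embeddings, `call` might be `lambda m: client.feature_extraction("hello", model=m)` with an `InferenceClient` instance named `client` (assumed here), run over both `BAAI/bge-large-en-v1.5` and `BAAI/bge-small-en-v1.5`.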

joshmaiven · September 18, 2025, 9:42am

Never mind, looks like it has started working again.

It was down for ~4-5 hours yesterday while I was testing, though.

GeorginaSteele · September 18, 2025, 10:37am

I was primarily using BAAI/bge-large-en-v1.5 and then switched to a few others. It’s now working again, as @joshmaiven pointed out.
