I just started running some inference tasks through the API. Yesterday it worked fine; today I'm getting:
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
It keeps retrying, but it's very slow. Is there an issue with the HF servers or something?
thanks,
Ed
import os
import requests

HF_API_URL = "https://api-inference.huggingface.co/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
HF_TOKEN = os.environ["HF_TOKEN"]  # personal access token, read from the environment

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

def get_model_response(sentence, prompt_template):
    # Build the prompt and query the serverless Inference API.
    prompt = prompt_template.format(sentence=sentence)
    payload = {
        "inputs": prompt,
        "parameters": {
            "temperature": 0.3,
            "max_new_tokens": 256,
            "top_p": 0.7,
            "return_full_text": False,  # return only the completion, not the prompt
        },
    }
    response = requests.post(HF_API_URL, headers=headers, json=payload)
    if response.status_code == 200:
        data = response.json()
        try:
            return data[0]["generated_text"].strip()
        except (IndexError, KeyError):
            return None
    else:
        print("Request failed:", response.status_code, response.text)
        return None
Same 500 here.
Qwen/Qwen2.5-72B-Instruct is responding normally, so it’s possible that DeepSeek R1 is just really busy…
A few days ago, staff reported that something was wrong with the Hugging Face servers or domain and that it was being adjusted (details unknown).
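If it is just congestion, wrapping the call in a simple retry with exponential backoff usually rides out intermittent 500s. A minimal sketch around your get_model_response (the function name, retry count, and delays below are just placeholders to adjust):

import time

def get_model_response_with_retry(sentence, prompt_template, max_retries=5, base_delay=5):
    # Retry a few times, doubling the wait after each failed attempt.
    for attempt in range(max_retries):
        result = get_model_response(sentence, prompt_template)
        if result is not None:
            return result
        wait = base_delay * (2 ** attempt)  # 5 s, 10 s, 20 s, ...
        print(f"Attempt {attempt + 1} failed, retrying in {wait}s...")
        time.sleep(wait)
    return None

That won't make an overloaded model any faster, but it smooths over the sporadic failures instead of giving up on the first 500.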
Thanks for your response, much appreciated.
I did try Qwen, but I seem to be getting the same error, hit and miss:
get_model_response(temptext, btl_prompt_template)
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Is there anywhere that indicates whether the HF servers are down or something?
thanks,
Is there anywhere that indicates whether the HF servers are down or something?
You can find information about site-wide outages below, but "sometimes it works and sometimes it doesn't" counts as barely working, so I don't think it will show up there.
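Since Qwen/Qwen2.5-72B-Instruct is responding normally, another option is to fall back to it whenever DeepSeek returns a 500. A rough sketch, assuming the same prompt and payload format is acceptable for both models (FALLBACK_API_URL and get_model_response_any are just names I made up, and whether the 72B model gives you comparable tags is something you would need to check):

FALLBACK_API_URL = "https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct"

def get_model_response_any(sentence, prompt_template):
    # Try DeepSeek first; if it returns nothing (e.g. after a 500), retry once against Qwen.
    result = get_model_response(sentence, prompt_template)
    if result is not None:
        return result
    payload = {
        "inputs": prompt_template.format(sentence=sentence),
        "parameters": {"temperature": 0.3, "max_new_tokens": 256, "top_p": 0.7},
    }
    response = requests.post(FALLBACK_API_URL, headers=headers, json=payload)
    if response.status_code == 200:
        try:
            return response.json()[0]["generated_text"].strip()
        except (IndexError, KeyError):
            return None
    print("Fallback request failed:", response.status_code, response.text)
    return None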
Thanks again. That seems to indicate the servers are up today, so I'm not sure what the issue is.
data["crc_response"] = [get_model_response(sentence, crc_prompt_template) for sentence in tqdm(data["Text"], desc="Processing CRC tags")]
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Processing CRC tags: 0%| | 2/400 [01:02<4:03:19, 36.68s/it]
Processing CRC tags: 4%|▎ | 14/400 [02:55<2:14:16, 20.87s/it]
Processing CRC tags: 6%|▌ | 22/400 [03:10<23:11, 3.68s/it]
Ed