Request failed: 500

Just started doing some inference tasks through the API. Yesterday it worked quite fine; today I am getting:
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
It keeps retrying, but super slowly - is there any issue with the HF servers or something?

thanks,
Ed

import requests

# HF_TOKEN is assumed to be defined elsewhere (e.g., loaded from the environment)
HF_API_URL = "https://api-inference.huggingface.co/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

def get_model_response(sentence, prompt_template):
    prompt = prompt_template.format(sentence=sentence)
    payload = {
        "inputs": prompt,
        "parameters": {
            "temperature": 0.3,
            "max_new_tokens": 256,
            "top_p": 0.7,
            "mask_instructions": True
        }
    }

    response = requests.post(HF_API_URL, headers=headers, json=payload)

    if response.status_code == 200:
        data = response.json()
        try:
            content = data[0]["generated_text"].strip()
            return content
        except (IndexError, KeyError):
            return None
    else:
        print("Request failed:", response.status_code, response.text)
        return None
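
In the meantime I've been wrapping the call in a simple retry with exponential backoff, which at least keeps the script alive while the model is busy (a minimal sketch; the retry count and delays are arbitrary choices):

import time

def get_model_response_with_retry(sentence, prompt_template, max_retries=5):
    """Call get_model_response() and back off while the model is busy."""
    delay = 2  # initial wait in seconds (arbitrary)
    for _ in range(max_retries):
        result = get_model_response(sentence, prompt_template)
        if result is not None:
            return result
        # The 500 "Model too busy" error is transient, so wait and retry
        time.sleep(delay)
        delay = min(delay * 2, 60)  # cap the backoff at 60 seconds
    return None

(There is also a documented "options": {"wait_for_model": true} field you can add to the payload, though as far as I know that helps with the 503 "model is loading" case rather than with "too busy".)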

Same 500 here.
Qwen/Qwen2.5-72B-Instruct is responding normally, so it’s possible that DeepSeek R1 is just really busy…

The staff reported that something was wrong with the Hugging Face server or domain a few days ago and that it was being adjusted (details unknown).
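
If you're blocked right now, one workaround is to just try a list of models in order and take the first one that answers. A rough sketch, reusing the headers and payload format from the post above (the model list is only an example, and different models may expect different prompt formats):

import requests

# Models to try, in order of preference (example list only)
FALLBACK_MODELS = [
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    "Qwen/Qwen2.5-72B-Instruct",
]

def query_first_available(payload, headers):
    """Return the first successful generation from the fallback list."""
    for model in FALLBACK_MODELS:
        url = f"https://api-inference.huggingface.co/models/{model}"
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code == 200:
            return response.json()[0]["generated_text"].strip()
    return None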


Thanks for your response, much appreciated.

I did try with Qwen, but I seem to be getting the same error, hit and miss -

get_model_response(temptext, btl_prompt_template)
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}

Is there anywhere to indicate if HF servers are down or something?

thanks,


Is there anywhere to indicate if HF servers are down or something?

You can find information about larger outages below, but "sometimes it works and sometimes it doesn't" is only a partial failure, so I don't think it will be reflected there.
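
If you just want a quick programmatic check, a tiny probe request at least tells you what the endpoint is saying right now (a sketch, using the same token as above):

import requests

def probe_model(model_id, token):
    """Send a minimal request and print what the Inference API answers."""
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    headers = {"Authorization": f"Bearer {token}"}
    r = requests.post(url, headers=headers, json={"inputs": "ping"})
    print(r.status_code, r.text[:200])
    # 200 means the model is serving; 503 usually means it is still loading
    # (the body often includes an "estimated_time"); 500 "Model too busy"
    # tends to clear on its own once load drops.

probe_model("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B", HF_TOKEN)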

Thanks again - that seems to indicate the servers are up today, so I'm not sure what the issue is.

data["crc_response"] = [get_model_response(sentence, crc_prompt_template) for sentence in tqdm(data["Text"], desc="Processing CRC tags")]
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Processing CRC tags: 0%| | 2/400 [01:02<4:03:19, 36.68s/it]
Processing CRC tags: 4%|▎ | 14/400 [02:55<2:14:16, 20.87s/it]
Processing CRC tags: 6%|▌ | 22/400 [03:10<23:11, 3.68s/it]
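
Since each run dies partway through, I've switched to filling in only the rows that are still missing, so a rerun doesn't redo the whole dataset. A minimal sketch, assuming data is a pandas DataFrame and the first pass left None in the failed rows:

from tqdm import tqdm

# Retry only the rows where the previous pass returned None
mask = data["crc_response"].isna()
for idx in tqdm(data.index[mask], desc="Retrying failed CRC tags"):
    data.loc[idx, "crc_response"] = get_model_response(
        data.loc[idx, "Text"], crc_prompt_template
    )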

Ed


I have recently started using this service and it hasn't worked for me yet. I'm fairly sure the URL (https://api-inference.huggingface.co/models/bigcode/starcoder/) is correct, so I guess it's something to do with server capacity.
Are there any other options to make it work? Any suggestions would be appreciated.

Error:
500 - {"error": "Model too busy, unable to get response in less than 60 second(s)"}
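
One thing I plan to try is the huggingface_hub client instead of raw requests; it hits the same Inference API but surfaces clearer errors. A minimal sketch (the token string is a placeholder):

from huggingface_hub import InferenceClient

# The token here is a placeholder; use your own HF access token
client = InferenceClient(model="bigcode/starcoder", token="hf_...")

# text_generation() raises a descriptive exception instead of a bare 500
output = client.text_generation("def fibonacci(n):", max_new_tokens=64)
print(output)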


SMH, I don't think I can really accept the answers I've been observing.
With all due respect to everyone, from the bottom of my heart, I understand we're all in the same boat,
but these issues stand out to me and have for some time.

(0. Python, and Gradio, and HF, are NOT my first languages; I'm amazed I even imported Gradio.)

  1. Is there any staff here? What is the purpose of HF, and to whom are they catering?

  2. I'm seeing 500, 504, and 404 errors (and a small number of others) with thousands of image-gen models.
    But when I change or shift my load anywhere between 3 and 30, I SWEAR (though I haven't confirmed it to absolutes) that my 500 and 504 errors will shift over many of the models:
    those that hadn't loaded will, and those that did, won't.

But the 404 errors, from the models of a SPECIFIC user, cough cough,
(for the record, I'm not blaming ANY of us,
especially you, SPECIFIC user.
I FULLY appreciate all that you do for everyone here)
remain,

token or not,
and only for that specific user's models.

  3. I think the timeline of HF's new third-party inference service rollout suggests it's the culprit, but I can really only accept that for the 404 errors,
    not so much for all the others,
    because of the tendency they have to shift when I shift my workload.

  4. I've been here about 300 days. I've seen HF have its server-side ups and downs; I've fought and cried over threads and queues and resets, and I've come to terms with all those struggles.

But suddenly:
a harsh and long, two-week connection disruption,
with only the hint of "it's server side, we must wait",
and then finding they've actually SCHEDULED this third-party service rollout
(the only indicator of that was the little blue NEW tag on it),
then I can ONLY sit back and say...

So that leaves question 1 as the only important question.
I would assume, since the answer isn't US (or they'd be here, or they would have informed us better), that maybe the third parties are the answer.

Considering HF has just sat on this "it's a server issue" issue, I'm thinking the third parties are losing interest, as are many of us.

And of course there's #0, but that's solely in my court.

If all I'm getting are 404 and 500 errors
for a month
and a blank screen,

then my ignorance surely will floweth over eventually.

(This is really just a prayer for some hope or some help.
I usually avoid forums and find my own solutions.)

So if you're there and made it here,
all I can say with certainty at this point
is
thank you for listening.

And I guess we should just wait another two weeks?
(throws all my paperwork into the air above me)


@julien-c I know that I shouldn’t normally be mentioning you, but please reassure us…:downcast_face_with_sweat: