I just started running some inference tasks through the API. Yesterday it worked fine; today I'm getting:
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
It keeps retrying, but it's very slow. Is there an issue with the HF servers or something?
thanks,
Ed
import os
import requests

HF_API_URL = "https://api-inference.huggingface.co/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
HF_TOKEN = os.environ["HF_TOKEN"]  # personal access token, read from the environment

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

def get_model_response(sentence, prompt_template):
    # Build the prompt and query the serverless Inference API.
    prompt = prompt_template.format(sentence=sentence)
    payload = {
        "inputs": prompt,
        "parameters": {
            "temperature": 0.3,
            "max_new_tokens": 256,
            "top_p": 0.7,
            "return_full_text": False,  # return only the completion, not the prompt
        },
    }
    response = requests.post(HF_API_URL, headers=headers, json=payload)
    if response.status_code == 200:
        data = response.json()
        try:
            return data[0]["generated_text"].strip()
        except (IndexError, KeyError):
            return None
    else:
        print("Request failed:", response.status_code, response.text)
        return None
Same 500 here.
Qwen/Qwen2.5-72B-Instruct is responding normally, so it’s possible that DeepSeek R1 is just really busy…
A few days ago, staff reported that something was wrong with the Hugging Face servers or domain and that it was being adjusted (details unknown).
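If it is just congestion, wrapping the call in a simple retry with exponential backoff usually rides out intermittent 500s. A minimal sketch around your get_model_response (the function name, retry count, and delays below are just placeholders to adjust):

import time

def get_model_response_with_retry(sentence, prompt_template, max_retries=5, base_delay=5):
    # Retry a few times, doubling the wait after each failed attempt.
    for attempt in range(max_retries):
        result = get_model_response(sentence, prompt_template)
        if result is not None:
            return result
        wait = base_delay * (2 ** attempt)  # 5 s, 10 s, 20 s, ...
        print(f"Attempt {attempt + 1} failed, retrying in {wait}s...")
        time.sleep(wait)
    return None

That won't make an overloaded model any faster, but it smooths over the sporadic failures instead of giving up on the first 500.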
Thanks for your response, much appreciated.
I did try Qwen, but I seem to be getting the same error, hit and miss:
get_model_response(temptext, btl_prompt_template)
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Is there anywhere that indicates whether the HF servers are down or something?
thanks,
Is there anywhere that indicates whether the HF servers are down or something?
You can find information about site-wide outages below, but "sometimes it works and sometimes it doesn't" counts as barely working, so I don't think it will show up there.
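Since Qwen/Qwen2.5-72B-Instruct is responding normally, another option is to fall back to it whenever DeepSeek returns a 500. A rough sketch, assuming the same prompt and payload format is acceptable for both models (FALLBACK_API_URL and get_model_response_any are just names I made up, and whether the 72B model gives you comparable tags is something you would need to check):

FALLBACK_API_URL = "https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct"

def get_model_response_any(sentence, prompt_template):
    # Try DeepSeek first; if it returns nothing (e.g. after a 500), retry once against Qwen.
    result = get_model_response(sentence, prompt_template)
    if result is not None:
        return result
    payload = {
        "inputs": prompt_template.format(sentence=sentence),
        "parameters": {"temperature": 0.3, "max_new_tokens": 256, "top_p": 0.7},
    }
    response = requests.post(FALLBACK_API_URL, headers=headers, json=payload)
    if response.status_code == 200:
        try:
            return response.json()[0]["generated_text"].strip()
        except (IndexError, KeyError):
            return None
    print("Fallback request failed:", response.status_code, response.text)
    return None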
Thanks again. That seems to indicate the servers are up today, so I'm not sure what the issue is.
data["crc_response"] = [get_model_response(sentence, crc_prompt_template) for sentence in tqdm(data["Text"], desc="Processing CRC tags")]
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Request failed: 500 {"error": "Model too busy, unable to get response in less than 60 second(s)"}
Processing CRC tags: 0%| | 2/400 [01:02<4:03:19, 36.68s/it]
Processing CRC tags: 4%|▎ | 14/400 [02:55<2:14:16, 20.87s/it]
Processing CRC tags: 6%|▌ | 22/400 [03:10<23:11, 3.68s/it]
Ed