Request to Serverless Inference API failed with 400 status code

I’m consistently encountering issues with the Hugging Face Serverless Inference API, specifically with the hf-inference provider. When I run the sample code from the model’s serverless deployment page, I receive the following error:

openai.BadRequestError: Error code: 400 - {'error': 'Not allowed to request v1/chat/completions for provider hf-inference'}

I tried the following models, all of which fail with the same status code:

  • meta-llama/Llama-3.3-70B-Instruct
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
  • deepseek-ai/DeepSeek-R1-Distill-Llama-70B

Does anyone have any ideas or suggestions on how to resolve this issue?
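For reference, here is a minimal sketch of the kind of call I’m making. The base URL and token handling here are my approximation of the sample snippet (the exact router URL on the deployment page may differ):

```python
import os
from openai import OpenAI

# Sketch of the failing call. The base URL below is an assumption about
# what the sample snippet uses for the hf-inference provider; adjust it
# if your deployment page shows something different.
client = OpenAI(
    base_url="https://router.huggingface.co/hf-inference/v1",
    api_key=os.environ["HF_TOKEN"],
)

# This is the request that comes back with the 400 error quoted above.
completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```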


Same here. There are several similar reports on the Hub forums and on the HF Discord.


It looks like you’re getting a 400 Bad Request from the Serverless Inference API. That usually points at the request itself: malformed input, missing parameters, or the wrong endpoint for the provider. I’d double-check the request payload, the headers, and any required parameters, and confirm that the endpoint URL you’re hitting is the correct one for the hf-inference provider and that your authorization token is valid. Let me know if you’d like help troubleshooting further!
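One thing worth trying as a cross-check, sketched below: route the same request through huggingface_hub’s InferenceClient with the provider set explicitly, assuming a huggingface_hub version recent enough (roughly ≥ 0.28) to accept the provider argument:

```python
from huggingface_hub import InferenceClient

# Sketch assuming huggingface_hub >= 0.28, where InferenceClient accepts
# a `provider` argument; replace the placeholder with your own HF token.
client = InferenceClient(provider="hf-inference", token="hf_xxx")

# chat_completion sends the same chat/completions-style payload.
response = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```

If the same model works through this path but not through the raw OpenAI-compatible URL, that would suggest the problem is in the endpoint/provider routing rather than in your payload or token.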