Hugging Face Payment Error 402 & You've Exceeded Monthly Quota

Whenever I try to generate a response from the API, it shows payment error 402 and this link: “https://huggingface.co/api/inference-proxy/hf-inference/models/Qwen/QwQ-32B/v1/chat/completions”. Opening the link shows “Sorry, we can’t find the page you are looking for.” The console also shows this promise error: “You have exceeded your monthly included credits for Inference Providers. Subscribe to PRO to get 20x more monthly allowance.” I have hardly used it at all. I even created a new account because of this kind of problem. I’ve had this problem for at least 2 months. I’m using JS with hf-inference. Where is the problem?

curl 'https://router.huggingface.co/hf-inference/models/Qwen/QwQ-32B/v1/chat/completions' \
    -H 'Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxx' \
    -H 'Content-Type: application/json' \
    --data '{
        "model": "Qwen/QwQ-32B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ],
        "max_tokens": 500,
        "stream": false
    }'

I think the URL of the request endpoint has changed (the curl example above uses the router.huggingface.co address). Also, payment methods are still being prepared for now.

That gives a 404 page error! Also, I want to use it in a React app with JS, not curl.

import { HfInference } from "@huggingface/inference";

const client = new HfInference("hf_xxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
	model: "Qwen/QwQ-32B",
	messages: [
		{
			role: "user",
			content: "What is the capital of France?"
		}
	],
	provider: "hf-inference",
	max_tokens: 500,
});

console.log(chatCompletion.choices[0].message);

PRO user ran into the same error… any update?

It’s said that PRO users can keep using the API beyond the included free credits and will be billed accordingly. But all my API requests have failed since I used up the free credits. I tried changing inference providers, but that didn’t work.

I encountered the same error today. It seems that they are currently trying to fix it.

I encountered this as well and am waiting for a fix.

Important update for Inference API quota.

Hi, I have the same problem. I noticed that even though I belong to an Enterprise organization, when I use smolagents’ HfApiModel for inference, it uses the credits of my free account and not those of the organization. Yet I read here that it should automatically use those of the organization. Where is the problem?

Thank you for your help!

It’s probably a bug related to handling Enterprise tokens… @meganariley

Thank you. Is there a plan to fix this issue? It’s important for us to be able to use the Inference API in our company…
Thank you very much.

Well, since you’re a paid service user, come to think of it, there are other ways to contact them.

Hi @alexman83 can you please make sure you’re billing the org in your request? You’ll run into this error message if you’re not passing "X-HF-Bill-To: my-org-name" as a header in your HTTP requests. More info here: Pricing and Billing.
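
For a raw HTTP call, the header would sit next to the Authorization header. Below is a minimal Python sketch, assuming the router URL from the curl example earlier in this thread. The token and `my-org-name` are placeholders, and `build_request` is only an illustrative helper name. It builds the request without sending it.

```python
import json
import urllib.request

# URL taken from the curl example earlier in this thread.
ROUTER_URL = "https://router.huggingface.co/hf-inference/models/Qwen/QwQ-32B/v1/chat/completions"


def build_request(token: str, org: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request billed to an org."""
    payload = {
        "model": "Qwen/QwQ-32B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 500,
        "stream": False,
    }
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Header named in the reply above; the value is the org to bill.
        "X-HF-Bill-To": org,
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder token and org name.
req = build_request(
    "hf_xxxxxxxxxxxxxxxxxxxxxxxx", "my-org-name", "What is the capital of France?"
)
# urllib.request.urlopen(req) would actually send the request.
```

Note that `urllib` normalizes the capitalization of header names when storing them; HTTP header names are case-insensitive, so this should not matter to the server.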

Hi @meganariley, I am using this code. Where do I have to put that information?

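from smolagents import CodeAgent, HfApiModel

# `retriever` is a tool defined elsewhere (not shown here)
llm_model = HfApiModel(model_id='Qwen/Qwen2.5-Coder-32B-Instruct')

agent = CodeAgent(
    tools=[retriever],
    model=llm_model,
    verbosity_level=2,
    additional_authorized_imports=['pandas'],
)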

This is the error I get

Error in generating model output:
InferenceClient.chat_completion() got an unexpected keyword argument 'bill_to'

OK, I understand now. I saw the bug fix on GitHub, but upgrading with pip install doesn’t pull in the fixed version yet, and the bill_to parameter is still missing in the InferenceClientModel class.

Oh… We perhaps need:

pip install git+https://github.com/huggingface/huggingface_hub
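
To see whether the installed version already knows about `bill_to`, something like the following might help. It is only a sketch: `accepts_kwarg` is a made-up helper, and whether the fixed release exposes `bill_to` on the constructor or on `chat_completion` is an assumption, so it checks both.

```python
import inspect


def accepts_kwarg(func, name: str) -> bool:
    """Return True if `func` has a parameter called `name` or takes **kwargs."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


try:
    import huggingface_hub
    from huggingface_hub import InferenceClient

    print("huggingface_hub", huggingface_hub.__version__)
    # Assumption: bill_to may live on either of these two callables.
    print("on __init__:", accepts_kwarg(InferenceClient.__init__, "bill_to"))
    print("on chat_completion:", accepts_kwarg(InferenceClient.chat_completion, "bill_to"))
except ImportError:
    print("huggingface_hub (or InferenceClient) is not available in this environment")
```

If both lines print False, the environment is probably still on a release from before the fix.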

Same issue here (I’m not an Enterprise user though). I added a payment method and regenerated a token.

The code below:

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="hf_…",
)

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    max_tokens=512,
)

print(completion.choices[0].message)

Gives:

HfHubHTTPError: 402 Client Error: Payment Required for url: https://router.huggingface.co/together/v1/chat/completions (Request ID: Root=1-68179a18-5ef7e70807d1281213af66e7;fa14f4b5-cce7-4c74-b75d-0ad03f18093c)

You have exceeded your monthly included credits for Inference Providers. Subscribe to PRO to get 20x more monthly included credits.

This seems to happen in relation to token permission settings.
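
While debugging, it may help to catch the 402 explicitly and print the causes mentioned in this thread. A rough sketch, assuming the raised exception carries a `.response.status_code` attribute (as the `HfHubHTTPError` in the traceback above should); `is_payment_required` and `run_with_quota_hint` are hypothetical helper names, not part of any library.

```python
def is_payment_required(exc: Exception) -> bool:
    """True if `exc` carries an HTTP response with status code 402."""
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None) == 402


def run_with_quota_hint(call):
    """Run `call()`; on a 402, print a hint and re-raise the original error."""
    try:
        return call()
    except Exception as exc:
        if is_payment_required(exc):
            # Causes taken from the replies in this thread.
            print(
                "402 Payment Required: check remaining included credits, "
                "whether the request is billed to the intended org "
                "(X-HF-Bill-To), and the token's permission settings."
            )
        raise
```

Usage would look like `run_with_quota_hint(lambda: client.chat.completions.create(...))`, with `client` being the `InferenceClient` from the snippet above.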