Hugging Face token usage for routed requests to a custom provider

Hello!
I am currently working on an inference provider integration, and there seems to be a lack of documentation on the tokens used specifically for routed requests.
I have, of course, read this doc: How to be registered as an inference provider on the Hub?

So, the question is: when a user makes a routed request to our backend, it carries their HF token.
On our side, we need to validate that token (via whoami-v2, AFAIK), pass the request on to the inference backend, generate a random UUID for the Inference-ID response header, and store the request so it can be reported to the billing endpoint later.
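To make the flow concrete, here is a minimal stdlib-only sketch of the two provider-side steps (token validation and Inference-ID generation). It assumes the public whoami-v2 endpoint at `https://huggingface.co/api/whoami-v2`; the function names are my own, not anything from the provider spec:

```python
import json
import uuid
import urllib.request

# Public endpoint for resolving/validating an HF token (assumption: this is
# the whoami-v2 endpoint referred to in the provider docs).
HF_WHOAMI_URL = "https://huggingface.co/api/whoami-v2"


def validate_hf_token(token: str) -> dict:
    """Validate a routed request's HF token against whoami-v2.

    Returns the decoded user/org info on success; raises
    urllib.error.HTTPError (401) when the token is invalid.
    """
    req = urllib.request.Request(
        HF_WHOAMI_URL,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def make_inference_id() -> str:
    """Random UUID returned to the client in the Inference-ID response
    header and stored locally so the request can be reported for billing."""
    return str(uuid.uuid4())
```

In a real service the whoami-v2 result would typically be cached for a short TTL so every routed request does not hit the Hub API.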
I see an opportunity to simply abuse this: anyone holding a valid HF token can send it straight to our backend and make multiple "free", non-billed requests.
I don't see any prevention mechanism in the docs, nor any other routed-request info (such as an additional token) that would help.
The only option I can see from here is to check whether a request for a given token gets billed within some time window, and ban the token if not.
Yet that feels like an overcomplicated way to handle this.

Is there anything I am missing?
