Does anyone know what the pay-as-you-go per-token pricing is for the inference providers, e.g. Cerebras?
I don’t think there is a price list at the moment…
There are some testimonials.
If you want more reliable information, it’s quicker to contact the Hugging Face support team at billing@huggingface.co.
API inference limit changed?
That’s ass.
How am I going to know how much I’ll spend with each provider/model if prices aren’t transparent or listed somewhere?
It’s really a mystery. I don’t want to be too vague, but I don’t think there is a simple price list anyway.
For example, prices vary depending on the model, so it would be difficult to make a simple list…
For instance, a request to black-forest-labs/FLUX.1-dev that takes 10 seconds to complete on a GPU machine that costs $0.00012 per second to run, will be billed $0.0012.
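The arithmetic in that example is just duration times the hardware's per-second rate. A minimal sketch in Python, assuming billing is purely compute time multiplied by a flat rate (the function name is made up for illustration, not part of any HF library):

```python
from decimal import Decimal

def compute_time_cost(seconds, usd_per_second):
    """Cost of one request billed by compute time: duration x hardware rate.

    Hypothetical helper for illustration; assumes a flat per-second rate.
    """
    # Decimal avoids binary floating-point noise on small currency amounts.
    return Decimal(str(seconds)) * Decimal(str(usd_per_second))

# Figures from the FLUX.1-dev example: 10 seconds at $0.00012 per second.
cost = compute_time_cost(10, "0.00012")
print(f"${cost}")
```

This only reproduces the quoted example. It does not tell you the rate for any other model or provider, which is the missing piece.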
But HF knows how much to bill me after I make a request, so there must be some pricing process or return value that shows how much I was billed for that request.
That’s right. If HF ran a test request for each model, it should be possible to list the prices in some way…
Anyway, it’s psychologically difficult to buy items without price tags…
To explain my situation: my service passes the inference cost on to my clients. These amounts need to be exact so I don’t take any losses. That’s why I need to know how much to charge per input/output token for each model…
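Once per-token rates are known, passing the cost on could look roughly like the sketch below. The rates in the usage example are placeholders, not real prices for any provider or model, and the function name and the per-million-token unit are assumptions for illustration:

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

def client_price(input_tokens, output_tokens, in_rate, out_rate, margin=Decimal("0")):
    """Price to charge a client for one request.

    in_rate / out_rate: USD per million input / output tokens (assumed unit).
    margin: fractional markup on top of cost, e.g. Decimal("0.10") for 10%.
    """
    cost = (Decimal(input_tokens) * in_rate + Decimal(output_tokens) * out_rate) / MILLION
    return cost * (Decimal(1) + margin)

# Placeholder rates only: $1.00 per million input tokens, $2.00 per million output tokens.
price = client_price(1000, 500, Decimal("1.00"), Decimal("2.00"), margin=Decimal("0.10"))
print(f"${price}")
```

Using Decimal rather than float keeps the amounts exact, which matters if the charge to the client has to match the bill to the cent.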