How to get Accelerated Inference API for T5 models?

Hello @Narsil. If I understand your answer correctly, both the CPU and GPU Accelerated Inference API are for paid plans (you call it a “customer plan”, am I right?), which are the Pro Plan, Lab, and Enterprise plans on the HF pricing page.

Contributor plan | Try Accelerated Inference API: CPU, no?

However, this is not what is written on the HF pricing page. As you can see in the screenshot below, even a Contributor plan (I have one at pierreguillou (Pierre Guillou)) can Try Accelerated Inference API: CPU.

Since my first test used the T5 base model, which is not optimized even in AWS SageMaker (see this post from @philschmid), I ran another test with distilbert-base-uncased-distilled-squad. As with the T5 base model, this distilbert model is not CPU-accelerated through the Inference API.

You can check my Colab notebook HF_Inference_API.ipynb.
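For reference, here is a minimal sketch (standard library only, not the notebook itself) of the kind of latency check I ran against the hosted Inference API. The `YOUR_API_TOKEN` value is a placeholder, and inspecting the `x-compute-type` response header is my assumption about how to see which backend (plain CPU vs. accelerated) actually served the request:

```python
import json
import time
import urllib.request

# Assumptions: this is the public Inference API URL scheme for the model
# I tested; YOUR_API_TOKEN is a placeholder for a real HF API token.
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-distilled-squad"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN",
           "Content-Type": "application/json"}

def mean_latency(timings):
    """Average of a list of per-request latencies in seconds."""
    return sum(timings) / len(timings)

def query(payload):
    """POST one request; return (json body, latency in seconds, compute type).

    The x-compute-type response header (e.g. "cpu") is how the API
    reports which backend served the request -- assumption on my part
    that this is the right field to check for acceleration.
    """
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(API_URL, data=data, headers=HEADERS)
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
        compute_type = resp.headers.get("x-compute-type")
    elapsed = time.perf_counter() - start
    return body, elapsed, compute_type

# Example usage (requires a real token, so left as a comment):
# payload = {"inputs": {"question": "Which model did I test?",
#                       "context": "I tested distilbert-base-uncased-distilled-squad."}}
# timings = []
# for _ in range(5):
#     _, t, compute_type = query(payload)
#     timings.append(t)
#     print(compute_type, f"{t:.3f}s")
# print(f"mean latency: {mean_latency(timings):.3f}s")
```

Comparing the mean latency on a free plan against the ~10x speedup announced for CPU-Accelerated Inference is what led me to conclude the acceleration is not applied.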

About the expression “customer plan” or “API customer”

I think the HF team should edit the paragraph Using CPU-Accelerated Inference (~10x speedup) to add a clear definition (see screenshot below), and verify the HF pricing page, too.

Conclusion | Contributors on the HF model hub cannot test the CPU-Accelerated Inference API :frowning:

But after saying all that, the reality is that we (the model contributors on the HF model hub) cannot test the CPU-Accelerated Inference API. What a pity!

Note: I did not understand your last comment (quoted below).

Also keep in mind as mentioned in the docs, that for customers we’re usually able to go beyond the default depending on the load and requirements.
