How to get Accelerated Inference API for T5 models?

Hello @Narsil. If I understand your answer correctly, both the CPU and GPU Accelerated Inference API are for paid plans (you call it a “customer plan”, am I right?), which are the Pro Plan, Lab, and Enterprise plans on the HF pricing page.

Contributor plan | Try Accelerated Inference API: CPU, no?

However, this is not what is written on the HF pricing page. As you can see in the screenshot below, even a Contributor plan (I have one at pierreguillou (Pierre Guillou)) can Try Accelerated Inference API: CPU.

Since my first test used the T5 base model, which is not optimized even in AWS SageMaker (see this post from @philschmid), I ran another test with distilbert-base-uncased-distilled-squad. As with the T5 base model, this distilbert model is not CPU-accelerated through the Inference API.

You can check my Colab notebook HF_Inference_API.ipynb.
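For reference, here is a minimal sketch (standard library only, not the notebook itself) of the kind of latency check I ran against the hosted Inference API. The `YOUR_API_TOKEN` value is a placeholder, and inspecting the `x-compute-type` response header is my assumption about how to see which backend (plain CPU vs. accelerated) actually served the request:

```python
import json
import time
import urllib.request

# Assumptions: this is the public Inference API URL scheme for the model
# I tested; YOUR_API_TOKEN is a placeholder for a real HF API token.
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-distilled-squad"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN",
           "Content-Type": "application/json"}

def mean_latency(timings):
    """Average of a list of per-request latencies in seconds."""
    return sum(timings) / len(timings)

def query(payload):
    """POST one request; return (json body, latency in seconds, compute type).

    The x-compute-type response header (e.g. "cpu") is how the API
    reports which backend served the request -- assumption on my part
    that this is the right field to check for acceleration.
    """
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(API_URL, data=data, headers=HEADERS)
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
        compute_type = resp.headers.get("x-compute-type")
    elapsed = time.perf_counter() - start
    return body, elapsed, compute_type

# Example usage (requires a real token, so left as a comment):
# payload = {"inputs": {"question": "Which model did I test?",
#                       "context": "I tested distilbert-base-uncased-distilled-squad."}}
# timings = []
# for _ in range(5):
#     _, t, compute_type = query(payload)
#     timings.append(t)
#     print(compute_type, f"{t:.3f}s")
# print(f"mean latency: {mean_latency(timings):.3f}s")
```

Comparing the mean latency on a free plan against the ~10x speedup announced for CPU-Accelerated Inference is what led me to conclude the acceleration is not applied.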

About the expression “customer plan” or “API customer”

I think the HF team should edit the paragraph Using CPU-Accelerated Inference (~10x speedup) to add a clear definition (see screenshot below), and verify the HF pricing page, too.

Conclusion | Contributors on the HF model hub cannot test the CPU-Accelerated Inference API :frowning:

But after saying all that, the reality is that we (the model contributors on the HF model hub) cannot test the CPU-Accelerated Inference API. What a pity!

Note: I did not understand your last comment (quoted below).

Also keep in mind as mentioned in the docs, that for customers we’re usually able to go beyond the default depending on the load and requirements.
