Speed up the prediction in transformers models

devpy079 · November 23, 2021, 12:03pm

Hi, I am using a couple of models from transformers. They work well on GPU, but their performance on CPU is not great.

I am using the Google Pegasus-XSum model for summarization, and it takes around 15 seconds to produce a result. I am also using the Parrot paraphrase library, which uses a T5 model in the backend; it is also very slow on CPU and takes around 5-7 seconds to generate a result.
Here are the links to both: google/pegasus-xsum · Hugging Face
Parrot Paraphrase: prithivida/parrot_paraphraser_on_T5 · Hugging Face

Any tips or suggestions for speeding up prediction on CPU? My server is currently limited…
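
For anyone who wants to reproduce the timings above, here is a rough sketch of how the CPU latency could be measured, so that any speed-up can be compared against the same baseline. It is not the exact script behind the numbers quoted in this post: `time_call` and `benchmark_pegasus_cpu` are hypothetical helper names, and the warm-up call and median over a few runs are assumptions about what a fair measurement might look like.

```python
import statistics
import time
from typing import Any, Callable, List


def time_call(fn: Callable[..., Any], *args: Any, repeats: int = 3, **kwargs: Any) -> List[float]:
    """Call fn(*args, **kwargs) `repeats` times; return each wall-clock duration in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return durations


def benchmark_pegasus_cpu(text: str, repeats: int = 3) -> float:
    """Return the median CPU latency (seconds) of summarizing `text` with google/pegasus-xsum.

    Hypothetical helper: transformers is imported locally so that time_call
    can be used on its own without the library installed.
    """
    from transformers import pipeline

    # device=-1 asks the pipeline to run on CPU.
    summarizer = pipeline("summarization", model="google/pegasus-xsum", device=-1)
    summarizer(text)  # warm-up call, excluded from the timing
    return statistics.median(time_call(summarizer, text, repeats=repeats))
```

Calling `benchmark_pegasus_cpu("some long article text")` downloads the model on first use and returns a single number in seconds. The same `time_call` helper should work for the Parrot paraphraser by passing its generation function instead.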
