Performance of idefics-9b

I am using idefics-9b on a Google Colab GPU machine. Without quantization, when I run inference with a basic prompt, the model takes more than 10 minutes to produce an answer.
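For context, this is roughly the kind of setup I mean, a minimal sketch following the model card's `IdeficsForVisionText2Text` / `AutoProcessor` example (the image URL, prompt text, and `device_map` choice below are placeholders, not my exact notebook code):

```python
import torch
from transformers import IdeficsForVisionText2Text, AutoProcessor

checkpoint = "HuggingFaceM4/idefics-9b"

processor = AutoProcessor.from_pretrained(checkpoint)

# No quantization and no reduced-precision dtype, so the checkpoint loads
# in the default float32; with device_map="auto", any layers that do not
# fit in GPU memory are offloaded to CPU by accelerate.
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint,
    device_map="auto",
)

# A basic prompt: an image (URL or PIL image) followed by text to continue
prompts = [
    [
        "http://images.cocodataset.org/val2017/000000039769.jpg",
        "Question: What animals are in this picture? Answer:",
    ],
]
inputs = processor(prompts, return_tensors="pt").to(model.device)

# Keep the model from emitting image placeholder tokens during generation
bad_words_ids = processor.tokenizer(
    ["<image>", "<fake_token_around_image>"], add_special_tokens=False
).input_ids

generated_ids = model.generate(
    **inputs, bad_words_ids=bad_words_ids, max_new_tokens=50
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```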
I'm not sure why it is so slow or how I can optimize it.