Performance of idefics-9b

I am using idefics-9b on a Google Colab GPU machine. Without quantization, when I run inference with a basic prompt, the model takes more than 10 minutes to produce an answer.
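For context, this is roughly the kind of setup I mean, a minimal sketch following the model card's `IdeficsForVisionText2Text` / `AutoProcessor` example (the image URL, prompt text, and `device_map` choice below are placeholders, not my exact notebook code):

```python
import torch
from transformers import IdeficsForVisionText2Text, AutoProcessor

checkpoint = "HuggingFaceM4/idefics-9b"

processor = AutoProcessor.from_pretrained(checkpoint)

# No quantization and no reduced-precision dtype, so the checkpoint loads
# in the default float32; with device_map="auto", any layers that do not
# fit in GPU memory are offloaded to CPU by accelerate.
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint,
    device_map="auto",
)

# A basic prompt: an image (URL or PIL image) followed by text to continue
prompts = [
    [
        "http://images.cocodataset.org/val2017/000000039769.jpg",
        "Question: What animals are in this picture? Answer:",
    ],
]
inputs = processor(prompts, return_tensors="pt").to(model.device)

# Keep the model from emitting image placeholder tokens during generation
bad_words_ids = processor.tokenizer(
    ["<image>", "<fake_token_around_image>"], add_special_tokens=False
).input_ids

generated_ids = model.generate(
    **inputs, bad_words_ids=bad_words_ids, max_new_tokens=50
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```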
I'm not sure why it is so slow or how I can optimize it.