Hi, I need to deploy multiple models (image recognition, Llama 2, video transcription, etc.), and I’m trying to estimate the cost. I have found the pricing for the different servers, but I don’t know how to calculate how many models I can run on the same machine (if it is even possible for different endpoints to share a machine). For example, say I want to use the IDEFICS model: how can I know which machine I need, and what percentage of the machine it would use?
I would appreciate any information on this topic.
Regards.