How to deploy a fine-tuned LLaVA model on Hugging Face Inference Endpoints using vLLM?

I was going over this article (Deploy open LLMs with vLLM on Hugging Face Inference Endpoints), and it says we need a custom container. Is that a hard requirement, or is it enough to add custom dependencies via requirements.txt (Add custom Dependencies)? Any examples or instructions for deploying multimodal models like LLaVA with vLLM would also help!
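For concreteness, here's a rough sketch of the handler.py I had in mind if the requirements.txt route works (with `vllm` and `pillow` in requirements.txt). The `EndpointHandler` structure follows the custom-handler docs; the vLLM multimodal call is pieced together from the vLLM examples, and the base64 image field and prompt template are just my assumptions:

```python
# handler.py — rough sketch, not tested on an actual endpoint
import base64
from io import BytesIO

from PIL import Image
from vllm import LLM, SamplingParams


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` should point at the repo with the fine-tuned LLaVA weights
        self.llm = LLM(model=path)
        self.sampling_params = SamplingParams(max_tokens=256)

    def __call__(self, data: dict) -> list:
        inputs = data["inputs"]
        # assuming the image arrives as a base64-encoded string
        image = Image.open(BytesIO(base64.b64decode(inputs["image"])))
        # LLaVA-1.5-style prompt template with an <image> placeholder
        prompt = f"USER: <image>\n{inputs['text']}\nASSISTANT:"

        outputs = self.llm.generate(
            {"prompt": prompt, "multi_modal_data": {"image": image}},
            self.sampling_params,
        )
        return [{"generated_text": outputs[0].outputs[0].text}]
```

Does something like this stand a chance of working on a standard endpoint, or does vLLM's need for direct GPU access make the custom container unavoidable?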