I want to pass on my gratitude and appreciation for Inference Endpoints, which is one of the most useful ML features available anywhere today. I started testing small Whisper and Llama models, running on a T4 and an A10 respectively, which seems like a perfect cost/benefit fit for those two models in an end-to-end speech-to-text, LLM, text-to-speech pipeline that lets you speak to an LLM. Thanks and kudos to the team!!!
Demo Space with the whole AI pipeline: 🐪DromeLlama🦙 Chat WhisperLangchain 🌟FAISS Embeddings - a Hugging Face Space by awacke1
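
For anyone curious how the speech to text to LLM to speech chain fits together, here is a minimal sketch, not the code from the Space. It assumes two endpoints you have deployed yourself (one Whisper, one Llama), that the Whisper endpoint accepts raw audio bytes and returns JSON with a `text` field, and that the Llama endpoint accepts `{"inputs": ...}` and returns a list with a `generated_text` field. The function names, the `audio/flac` content type, and the response shapes are my assumptions, so check them against your own endpoints. The text-to-speech step is left as a callable you supply, since the post does not say which model handles it.

```python
import json
import urllib.request
from typing import Callable, Tuple


def post_to_endpoint(url: str, token: str, body: bytes, content_type: str) -> bytes:
    """POST raw bytes to an endpoint URL and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def make_transcriber(url: str, token: str) -> Callable[[bytes], str]:
    """Wrap a (hypothetical) Whisper endpoint as audio bytes -> text."""
    def transcribe(audio: bytes) -> str:
        # Assumption: the endpoint takes FLAC audio and answers {"text": "..."}.
        raw = post_to_endpoint(url, token, audio, "audio/flac")
        return json.loads(raw)["text"]
    return transcribe


def make_generator(url: str, token: str) -> Callable[[str], str]:
    """Wrap a (hypothetical) Llama endpoint as prompt -> generated text."""
    def generate(prompt: str) -> str:
        # Assumption: the endpoint answers [{"generated_text": "..."}].
        body = json.dumps({"inputs": prompt}).encode("utf-8")
        raw = post_to_endpoint(url, token, body, "application/json")
        return json.loads(raw)[0]["generated_text"]
    return generate


def voice_chat_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Tuple[str, str, bytes]:
    """Run one turn: speech -> text -> LLM reply -> speech."""
    heard = transcribe(audio_in)      # Whisper step (T4 in the post)
    reply = generate(heard)           # Llama step (A10 in the post)
    audio_out = synthesize(reply)     # text-to-speech step, supplied by the caller
    return heard, reply, audio_out
```

Because each stage is just a callable, you can try `voice_chat_turn` with stub functions first and only swap in `make_transcriber(...)` and `make_generator(...)` with your real endpoint URLs and token once the endpoints are up.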