Seasoned front-end dev with a background in C++. New to Python, Docker, and WhisperAI. Developing on an RTX 3080 and a Ryzen 5800X with 32GB of RAM.
Got a Docker image/container of WhisperX going that I can tap for transcriptions ([WhisperX Docker Images]).
It works really well, but there's something I'm curious about: Whisper takes about 10 seconds to start transcribing each time I send over an audio file (using the medium model).
Any way to mitigate those 10 seconds? Via Docker? Via built in functionality of Whisper/WhisperX?
Wish there was a way to keep that startup work warm so the next Whisper request didn't have to start from zero.
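To make it concrete, here's a rough sketch of the pattern I'm imagining: a long-lived process that pays the model-load cost once at startup and then reuses the loaded model for every request. The `load_model`/`transcribe` names below are placeholders standing in for the real WhisperX calls, not the actual API:

```python
import time


def load_model(name):
    """Placeholder for the slow one-time startup (checkpoint load, CUDA init).

    In the real setup this would be the WhisperX model-loading step that
    currently takes ~10 seconds per request.
    """
    time.sleep(0.1)  # stands in for the slow startup
    return {"name": name}


class TranscriptionService:
    """Keeps the model resident so only the first call is slow."""

    def __init__(self, model_name="medium"):
        self.model = load_model(model_name)  # paid once, at startup

    def transcribe(self, audio_path):
        # The real call would hand audio to the already-loaded model;
        # the point is that nothing gets reloaded per request.
        return f"transcript of {audio_path} via {self.model['name']}"


if __name__ == "__main__":
    service = TranscriptionService()        # slow, happens once
    for clip in ["a.wav", "b.wav", "c.wav"]:
        print(service.transcribe(clip))     # fast, per request
```

Basically I'd want something like this living inside the container (behind an HTTP endpoint or a queue) instead of a script that cold-starts per file, if that's even possible with Docker/WhisperX.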
Screenshot of Whisper's output when initiating a transcription. This startup process is what I'd like to know can be mitigated/reduced.