Duration of audio sequence ingested by Whisper

jmassot · January 17, 2023, 6:50pm

Hi colleagues,

I have an issue when using Whisper: it transcribes only around 30 seconds of audio. Is this a known limitation? How can I get a transcription of longer audio files?

Thanks

Best regards

Jerome

philschmid · January 17, 2023, 7:08pm

Can you please share how you deployed your version?

We published a blog post on how to deploy it, which includes examples with longer audio transcription: Managed Transcription with OpenAI Whisper and Hugging Face Inference Endpoints
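For context, Whisper works on windows of about 30 seconds, so longer files are usually split into overlapping chunks that are transcribed one by one and then merged. The snippet below is only a minimal sketch of the splitting step, not the code from the blog post or what the Inference API does internally. The function name `chunk_bounds`, the 16 kHz sample rate, and the 5-second overlap are assumptions chosen for illustration. If you run the model locally, the `transformers` ASR pipeline has a `chunk_length_s` argument that should handle this for you.

```python
def chunk_bounds(num_samples, sample_rate=16_000, chunk_s=30.0, overlap_s=5.0):
    """Return (start, end) sample indices of overlapping windows covering the audio.

    Hypothetical helper for illustration only; all defaults are assumptions.
    """
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")

    chunk = int(chunk_s * sample_rate)            # window length in samples
    step = chunk - int(overlap_s * sample_rate)   # hop between window starts

    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break  # the last window reached the end of the audio
        start += step
    return bounds


# 70 seconds of 16 kHz audio is covered by three overlapping windows.
windows = chunk_bounds(70 * 16_000)
print(len(windows))
```

Each `(start, end)` pair can be used to slice the waveform before sending it to the model. The overlap gives the merge step some shared context at the chunk borders.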

jmassot · January 17, 2023, 7:23pm

Hi Phil,

Thanks for your reply. I have only used the Inference API (not a dedicated endpoint deployed for my private use). Maybe that is the reason. Do I need to deploy my own endpoint instead?

Thanks

Best regards
