ASR on inference endpoints

ejmiddle · October 1, 2023, 7:37pm

Hi there,

I am a bit lost on which path to take for the following task:
I want to run Whisper large v2 on quite a lot of audio files. The options I currently have working are the OpenAI API (costly) and local inference with pipelines (slow). Another, hopefully simple, option would be to use Inference Endpoints instead, based on the openai/whisper-large-v2 repo.

However, it seems like I cannot pass arguments there, like chunk_size, language etc. (at least this is what I understand from the documentation). So my questions are:

1. Is it correct that I cannot pass arguments if I simply set up the inference endpoint with the openai/whisper-large-v2 repo? If that is wrong, how do I do it?
2. If it is correct: what is an alternative? I found custom handlers as a possible solution, but I am a bit lost on what the logic is there. I would somehow have to combine the repositories philschmid/openai-whisper-endpoint and openai/whisper-large-v2, and doing so does not seem straightforward to me.

Any suggestions on this?

Thank you so much
Andi

brianjking · February 11, 2024, 10:30pm

Did you ever figure this out? When I deploy on inference endpoints for Whisper, it never works. I’ve not had this issue before with other models.
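
For the custom handler route asked about above, here is a minimal sketch of what a handler.py could look like. It is not taken from philschmid/openai-whisper-endpoint and has not been confirmed against a live endpoint. It assumes the endpoint repository is a copy of openai/whisper-large-v2 with handler.py added next to the weights, that the client sends JSON with base64-encoded audio under "inputs" and optional settings under "parameters", and that the installed transformers version accepts chunk_length_s at call time and language/task through generate_kwargs. The helper name split_request and the ALLOWED_CALL_KWARGS set are hypothetical, chosen only for illustration.

```python
import base64
from typing import Any, Dict, Tuple

# Hypothetical allow-list: per-request parameters this sketch forwards to the pipeline call.
ALLOWED_CALL_KWARGS = {"chunk_length_s", "stride_length_s", "return_timestamps"}


def split_request(data: Dict[str, Any]) -> Tuple[bytes, Dict[str, Any]]:
    """Separate the audio payload from the per-request parameters."""
    inputs = data["inputs"]
    # Assumption: JSON clients send base64 text; raw audio uploads arrive as bytes.
    audio = base64.b64decode(inputs) if isinstance(inputs, str) else bytes(inputs)

    params = dict(data.get("parameters") or {})
    call_kwargs = {k: v for k, v in params.items() if k in ALLOWED_CALL_KWARGS}

    # language and task are generation options, so they go into generate_kwargs.
    generate_kwargs = {k: params[k] for k in ("language", "task") if k in params}
    if generate_kwargs:
        call_kwargs["generate_kwargs"] = generate_kwargs
    return audio, call_kwargs


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Imported here so split_request can be tried out without transformers installed.
        from transformers import pipeline

        # Assumption: `path` points at the repository directory holding the Whisper weights.
        self.pipe = pipeline("automatic-speech-recognition", model=path)

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        audio, call_kwargs = split_request(data)
        result = self.pipe(audio, **call_kwargs)
        return {"text": result["text"]}
```

With a handler along these lines, a request body such as {"inputs": "<base64 audio>", "parameters": {"chunk_length_s": 30, "language": "german"}} would be turned into a pipeline call with chunking enabled and the language fixed. Unknown parameters are dropped rather than passed through, so a typo in a request does not crash the pipeline call.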
