Long audio input for training?

chlorane · July 20, 2023, 7:00am

I’m using Whisper for ASR training. Our ASR system needs to take long audio (more than 30 seconds) as input during training. I tried your ASR pipeline and found it useful for inference, but I could not find anything related to training, such as an output that exposes the loss. How can we use the Whisper model in a training process with long audio, even if its layers are frozen?
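To make the question concrete, here is a minimal sketch of one possible workaround: cutting a long recording into consecutive 30-second windows before training. It is not a confirmed recipe from the library. The helper name `window_bounds` is hypothetical, the windows are assumed to be non-overlapping, and the sketch assumes the transcript can be split to match each window, which is the hard part and is not shown.

```python
from typing import List, Tuple

SAMPLING_RATE = 16_000  # Whisper's feature extractor works on 16 kHz audio
WINDOW_SECONDS = 30     # Whisper's encoder sees fixed 30-second inputs


def window_bounds(num_samples: int,
                  sampling_rate: int = SAMPLING_RATE,
                  window_seconds: int = WINDOW_SECONDS) -> List[Tuple[int, int]]:
    """Hypothetical helper: (start, end) sample indices of consecutive,
    non-overlapping windows covering a recording of `num_samples` samples."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    window = sampling_rate * window_seconds
    # The last window is shorter when the length is not a multiple of `window`.
    return [(start, min(start + window, num_samples))
            for start in range(0, num_samples, window)]


# A 75-second recording: two full windows plus a 15-second remainder.
bounds = window_bounds(75 * SAMPLING_RATE)
durations = [(end - start) / SAMPLING_RATE for start, end in bounds]
print(durations)  # [30.0, 30.0, 15.0]
```

Each slice `audio[start:end]` could then go through the feature extractor and, with the matching transcript segment tokenized as `labels`, into `WhisperForConditionalGeneration`, whose forward pass returns a loss when `labels` is given. Whether this is the recommended approach for long-form training is exactly what I am unsure about.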
