Custom Training Set for Whisper - can it be < 30s clips?

extracounted · July 11, 2024, 1:29pm

I’m trying to fine-tune Whisper on some specific terminology, and I’m wondering whether the training clips can be shorter than 30 seconds. I remember reading somewhere that they need to be exactly 30s, but I can’t find that information anymore.
Thank you

(Also, I’m using a metadata CSV that holds the transcription and the path to each audio file, with the “Create an audio dataset” guide as my reference.)
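
For context, here is a minimal sketch of that kind of setup, using only the Python standard library. It assumes the audio files are PCM WAV and that the CSV is named `metadata.csv` with `file_name` and `transcription` columns (my reading of the guide's layout, so treat the column names as an assumption). The helper names `clip_seconds`, `write_metadata` and `too_long` are made up for illustration, and the 30-second threshold is just the figure from the question above.

```python
import csv
import wave
from pathlib import Path

MAX_SECONDS = 30.0  # threshold taken from the question, not a confirmed limit


def clip_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_metadata(data_dir, transcripts):
    """Write metadata.csv (file_name, transcription) next to the audio files.

    `transcripts` maps a file name relative to data_dir to its transcription.
    """
    data_dir = Path(data_dir)
    csv_path = data_dir / "metadata.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "transcription"])  # assumed column names
        for name, text in transcripts.items():
            writer.writerow([name, text])
    return csv_path


def too_long(data_dir, max_seconds=MAX_SECONDS):
    """List file names from metadata.csv whose audio exceeds max_seconds."""
    data_dir = Path(data_dir)
    with open(data_dir / "metadata.csv", newline="", encoding="utf-8") as f:
        return [
            row["file_name"]
            for row in csv.DictReader(f)
            if clip_seconds(data_dir / row["file_name"]) > max_seconds
        ]
```

Calling `too_long("my_dataset")` on a folder prepared this way would list any clips over the threshold, which makes it easy to see how the clip lengths are distributed before training.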
