How to apply SpecAugment to a Whisper model?

mehran · May 20, 2023, 3:55am

I’m trying to learn how to apply SpecAugment when fine-tuning a Whisper model. First of all, I’m not sure whether applying such augmentation is common practice for fine-tuning (maybe it’s only used for training?). And second, I don’t even know how to implement it.

I see that PyTorch has a library implementing the technique. But I guess I need to take into account the mask for each sample before using that library. And I don’t know how to do that.

I appreciate any guidance I can get. Thanks.

Kongda · July 25, 2023, 5:31am

Hi, I have the same question. Have you managed to solve this issue?

mehran · August 13, 2023, 8:37pm

I’m not sure if this is the right way to do it, but it seems to work:

from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

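# Clear the generation defaults inherited from the pretrained checkpoint
model.config.forced_decoder_ids = None
model.config.suppress_tokens = []

# Enable SpecAugment: masks are applied to the input features in training mode only
model.config.apply_spec_augment = True
model.config.mask_time_prob = 0.05
model.config.mask_feature_prob = 0.05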

The last three lines enable SpecAugment for the Whisper model. I haven’t tuned the values; the right settings still need to be worked out experimentally.
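To give a feel for what the two probabilities control, here is a minimal NumPy sketch of span masking on a single log-mel array. It is a simplified illustration, not the transformers implementation: the names span_mask and spec_augment are hypothetical. The length and min-mask parameters mirror what I believe are the other WhisperConfig fields (mask_time_length, mask_time_min_masks, mask_feature_length, mask_feature_min_masks), and their values here are assumptions. Roughly, mask_time_prob is the fraction of frames covered by masked spans and mask_feature_prob is the fraction of mel bins.

```python
import numpy as np


def span_mask(length, mask_prob, mask_length, min_masks, rng):
    """Boolean mask of shape (length,) built from fixed-length spans.

    About mask_prob * length positions end up masked; spans may overlap,
    so the actual fraction can be somewhat lower.
    """
    mask = np.zeros(length, dtype=bool)
    if mask_length < 1 or mask_length > length:
        return mask  # axis too short for even one span
    # Probabilistic rounding of the expected number of spans.
    num_spans = int(mask_prob * length / mask_length + rng.random())
    num_spans = max(num_spans, min_masks)
    max_starts = length - mask_length + 1
    num_spans = min(num_spans, max_starts)
    # Pick distinct start positions, then mark each span.
    starts = rng.choice(max_starts, size=num_spans, replace=False)
    for start in starts:
        mask[start:start + mask_length] = True
    return mask


def spec_augment(features, mask_time_prob=0.05, mask_time_length=10,
                 mask_time_min_masks=2, mask_feature_prob=0.05,
                 mask_feature_length=10, mask_feature_min_masks=0, seed=None):
    """Zero out random time and frequency bands of one (n_mels, n_frames) array."""
    rng = np.random.default_rng(seed)
    n_mels, n_frames = features.shape
    out = features.copy()  # leave the caller's array untouched
    if mask_time_prob > 0:
        time_mask = span_mask(n_frames, mask_time_prob, mask_time_length,
                              mask_time_min_masks, rng)
        out[:, time_mask] = 0.0
    if mask_feature_prob > 0:
        feature_mask = span_mask(n_mels, mask_feature_prob, mask_feature_length,
                                 mask_feature_min_masks, rng)
        out[feature_mask, :] = 0.0
    return out


# Stand-in for one Whisper input: 80 mel bins x 3000 frames (30 s of audio).
features = np.ones((80, 3000), dtype=np.float32)
augmented = spec_augment(features, seed=0)
masked_frames = int((augmented == 0).all(axis=0).sum())
print(f"{masked_frames} of {features.shape[1]} frames masked")
```

With 80 mel bins and a span length of 10, a mask_feature_prob of 0.05 often produces no frequency mask at all in this sketch, so it may be worth trying larger values or shorter spans when experimenting.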

anahar · February 9, 2024, 7:43am

Hey @mehran, have you observed any noticeable improvements in performance metrics after applying SpecAugment?
