I’m trying to learn how to apply SpecAugment when fine-tuning a Whisper model. First, I’m not sure whether this kind of augmentation is common practice for fine-tuning at all (maybe it’s only used when training from scratch?). Second, I don’t know how to implement it.
I see that PyTorch has a library that implements the technique, but I guess I need to take each sample’s mask into account before using it, and I don’t know how to do that.
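
To make the per-sample concern concrete, here is a rough sketch of length-aware time and frequency masking in plain PyTorch. It is only an illustration under several assumptions: the function name `spec_augment`, the `lengths` tensor of valid (unpadded) frame counts, the mask-size limits, and the fill value of 0.0 are all hypothetical choices, not something taken from Whisper or from any particular library. It also omits the time-warping step of SpecAugment.

```python
import torch


def spec_augment(features, lengths, max_time_mask=30, max_freq_mask=10,
                 mask_value=0.0, generator=None):
    """Apply one time mask and one frequency mask per sample.

    features: float tensor of shape (batch, n_mels, n_frames)
    lengths:  int tensor of shape (batch,), number of valid (unpadded) frames
    Masks are placed only inside each sample's valid region, so padding
    frames are left untouched. All parameter names here are hypothetical.
    """
    out = features.clone()  # do not modify the caller's tensor
    batch, n_mels, _ = out.shape
    for i in range(batch):
        valid = int(lengths[i])

        # Time mask: width in [0, min(max_time_mask, valid)], start chosen
        # so the whole mask stays inside the valid frames.
        t = int(torch.randint(0, min(max_time_mask, valid) + 1, (1,),
                              generator=generator))
        t0 = int(torch.randint(0, valid - t + 1, (1,), generator=generator))
        out[i, :, t0:t0 + t] = mask_value

        # Frequency mask: width in [0, min(max_freq_mask, n_mels)], applied
        # only across the valid frames of this sample.
        f = int(torch.randint(0, min(max_freq_mask, n_mels) + 1, (1,),
                              generator=generator))
        f0 = int(torch.randint(0, n_mels - f + 1, (1,), generator=generator))
        out[i, f0:f0 + f, :valid] = mask_value
    return out


# Example: two samples padded to 100 frames, the second has 40 valid frames.
g = torch.Generator().manual_seed(0)
features = torch.ones(2, 80, 100)
lengths = torch.tensor([100, 40])
augmented = spec_augment(features, lengths, generator=g)
```

Looping over the batch keeps the logic easy to follow. The point is that the mask positions are drawn from each sample's own valid length rather than from the padded length, which is the part I am unsure how to get from a ready-made transform.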