I am trying to fine-tune a Whisper model by following this guide: Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers
I want to change the loss function used during fine-tuning. For example, I would like to add a term that distills knowledge from another ASR model.
How do I modify the loss function, and how would I implement the knowledge distillation part?
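To make the question concrete, here is a rough sketch of the kind of loss I have in mind. It is not taken from the guide. It is plain Python for a single token position, and it assumes the teacher and the student share the same vocabulary, so their logits line up index by index. The names `distillation_loss`, `alpha`, and `temperature` are my own placeholders.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """Combined loss for one token position (hypothetical helper).

    alpha weights the usual cross-entropy against the ground-truth token;
    (1 - alpha) weights KL(teacher || student) on temperature-softened outputs.
    """
    # Hard-label term: cross-entropy of the student against the reference token.
    student_probs = softmax(student_logits)
    ce = -math.log(student_probs[label])

    # Soft-label term: KL divergence between softened teacher and student distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )

    # The temperature**2 factor is the common rescaling for the softened term.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

I assume the real version would do the same thing on logit tensors, averaged over all non-padded token positions, inside an overridden `compute_loss` of the `Seq2SeqTrainer`, with the teacher run under no-grad. I am not sure that is the right place for it, which is part of what I am asking.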
@sanchit-gandhi Would appreciate it if you can help!