How to modify loss function in a seq2seq trainer?

dmehta01 · August 30, 2024, 3:15pm

I am trying to fine-tune a Whisper model using this source: Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers

I want to modify the loss function used to fine-tune it. For example, I would like to modify the loss function to be able to distill knowledge from another ASR model.

So how do I modify the loss function and how would I do the knowledge distillation part as well?

@sanchit-gandhi Would appreciate it if you can help!

nielsr · August 31, 2024, 8:36am

Hi,

You can override the Seq2SeqTrainer’s compute_loss method as shown here.
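
For the distillation part, here is a rough sketch of what such an override could look like. It is not taken from the linked example. The names `DistillationTrainer`, `distillation_loss`, `teacher_model`, `alpha` and `temperature` are made up for illustration, and it assumes the teacher accepts the same inputs as the student and shares its tokenizer and vocabulary, so that the logits line up position by position. If the other ASR model uses a different vocabulary, logit-level distillation like this would not apply directly.

```python
import torch
import torch.nn.functional as F
from transformers import Seq2SeqTrainer


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0):
    """KL(teacher || student) averaged over label positions that are not -100."""
    mask = labels.ne(-100)  # padded label positions are ignored
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Per-token KL divergence, summed over the vocabulary: (batch, seq_len)
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="none").sum(-1)
    kl = (kl * mask).sum() / mask.sum().clamp(min=1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures
    return kl * temperature**2


class DistillationTrainer(Seq2SeqTrainer):
    """Hypothetical subclass: mixes the usual cross-entropy with a KD term."""

    def __init__(self, *args, teacher_model, alpha=0.5, temperature=2.0, **kwargs):
        super().__init__(*args, **kwargs)
        # Assumption: the teacher is frozen and fits on the training device
        self.teacher = teacher_model.to(self.args.device).eval()
        self.alpha = alpha
        self.temperature = temperature

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # **kwargs absorbs extra arguments that newer Trainer versions pass
        outputs = model(**inputs)  # outputs.loss is the standard cross-entropy
        with torch.no_grad():
            teacher_outputs = self.teacher(**inputs)
        kd = distillation_loss(
            outputs.logits, teacher_outputs.logits, inputs["labels"], self.temperature
        )
        loss = self.alpha * outputs.loss + (1.0 - self.alpha) * kd
        return (loss, outputs) if return_outputs else loss
```

You would then build it like the `Seq2SeqTrainer` in the blog post, with one extra keyword argument, e.g. `DistillationTrainer(model=model, args=training_args, ..., teacher_model=teacher)`. `alpha` weights the ground-truth loss against the teacher term, so it is worth tuning for your data.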
