TFT5ForConditionalGeneration with custom loss

BenM61 · April 4, 2022, 3:19pm

Hey, I want to fine-tune TFT5ForConditionalGeneration. This is very similar to this thread, but I want to train with teacher forcing (i.e. use the forward pass rather than the generate method).

One thing that crossed my mind is to inherit from TFT5ForConditionalGeneration and override the forward pass with a copy of the original (from the source code), changing only the loss computation to my own loss. Is that a good idea?

I will add more context below, but my question is how to use something other than CrossEntropyLoss as the loss for the T5 model.

Context:
My task is: given tokenized song lyrics, classify the genres of that song (multilabel classification).

The output I want from T5 is a string containing all the predicted genres, followed by the EOS token, and then padding tokens so that all samples have equal length (some songs have more than one genre). For example:

'FUNK,POP,,,</s> <pad> <pad> <pad> <pad> <pad>',

When I used the default T5 loss (training via forward), the training loss decreased, but the eval loss stayed bad and the model predicts nonsense. I suspect the model can lower the loss just by learning where to place the padding tokens, which is not what I want.

I’m not sure how to implement a differentiable loss that ignores the pads, but for now I am only asking how to change the loss in general.
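To make the "ignore the pads" idea concrete, here is a minimal sketch in plain Python of a token-level cross-entropy that skips padded positions. It is not the Transformers implementation. The names `mask_pad_labels` and `masked_cross_entropy` are hypothetical, the pad id of 0 is an assumption (check `tokenizer.pad_token_id`), and -100 is used as the "ignore" sentinel on the assumption that it matches the convention the library's loss follows.

```python
import math

IGNORE_INDEX = -100  # assumed sentinel for positions excluded from the loss
PAD_TOKEN_ID = 0     # assumed pad id; verify with tokenizer.pad_token_id


def mask_pad_labels(labels, pad_token_id=PAD_TOKEN_ID):
    """Replace pad ids with IGNORE_INDEX so they contribute nothing to the loss."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in labels]


def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over positions whose label is not IGNORE_INDEX.

    logits: list of per-position score lists (one score per vocabulary id).
    labels: list of target token ids, same length as logits.
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # padded position: skipped entirely
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[label]
        count += 1
    return total / count if count else 0.0


# Toy example: vocabulary of 4 ids, 4 positions, the last two are padding.
labels = mask_pad_labels([2, 1, 0, 0])
logits = [
    [0.0, 0.0, 5.0, 0.0],  # scores favour id 2 (the target)
    [0.0, 5.0, 0.0, 0.0],  # scores favour id 1 (the target)
    [9.0, 0.0, 0.0, 0.0],  # padding position, ignored
    [0.0, 0.0, 0.0, 9.0],  # padding position, ignored
]
loss = masked_cross_entropy(logits, labels)
```

With this masking, whatever the model predicts at padded positions has no effect on the loss, so it cannot improve the loss just by getting the padding right. The same masking could be applied to the labels before they are passed to the model, if the library's loss does honour the -100 convention.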

I can add the code if you want. Thanks!
