T5 user defined loss function

Hi,

Just a tip to save you some hassle, in case you don't already know this.

You're going to hit a snag with this idea if you try to backpropagate through the new loss, though it's perfectly fine as a logging metric.

Gradients cannot flow through a sampling or decoding step such as argmax, beam search, or nucleus sampling, because those operations are non-differentiable. If you add a loss computed on sampled token ids to your training objective, it will have no effect on your results.

loss = diversity_loss + lm_loss
loss.backward() # gradients from diversity_loss are all zero; the model still trains on lm_loss, so be careful: it can look like training is working while the diversity term does nothing!
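You can verify this for yourself in a couple of lines. The sketch below (PyTorch; `diversity` here is just a stand-in statistic, not your actual loss) shows that any tensor derived from `argmax` output has no `grad_fn`, so the graph is cut at that point:

```python
import torch

logits = torch.randn(3, 5, requires_grad=True)

# Path 1: a loss computed directly from the logits -- differentiable.
lm_loss = logits.softmax(dim=-1)[:, 0].sum()
lm_loss.backward()
print(logits.grad is not None)  # True: gradients flow back to the logits

logits.grad = None

# Path 2: argmax discretizes the logits into integer token ids.
token_ids = logits.argmax(dim=-1)    # int64 tensor, no grad_fn
diversity = token_ids.float().var()  # any "loss" built on top of the ids
print(diversity.requires_grad)       # False: the graph is cut at argmax
```

Calling `diversity.backward()` would raise an error, and adding it to `lm_loss` just contributes a constant, i.e. zero gradient. If you genuinely need to train through a sampling step, you'd have to look at relaxations or estimators such as Gumbel-Softmax or REINFORCE instead.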