How can I explicitly penalize a T5 model for wrong generations (cross entropy alone doesn't do the job)?

I’m trying to fine-tune a T5 language model in such a way that it is explicitly penalized for generating slightly incorrect answers. But which loss function should I use?

For example, I fine-tune the model on positive training tuples such as:

("[SYS] Where do you want to go ? [USR] Hello I want to book a flight from Atlanta to New York [SEP] ", "location_from=atlanta [SEP] location_to=new york [EOS]")

But I also want to explicitly penalize it for wrong answers. So if I have a set of “negative training tuples” such as:

("[SYS] Where do you want to go ? [USR] Hello I want to book a flight from Atlanta to New York [SEP] ", "location_from=new york [SEP] location_to=atlanta [EOS]")

I essentially want the loss function to output a large value whenever the model assigns a high probability to such a “wrong” next token (e.g. “new” right after “location_from=”). Plain cross entropy on the positive tuples only rewards the correct tokens; it never actively pushes probability away from these incorrect sequences.