I am trying to run a specific experiment with an input sequence and an expected target sequence, for which I use the T5ForConditionalGeneration model. Here is an example of my input and target formats:
Input Text: "Michael Jordan is a professor at Berkeley."
Target Text: "2 Entities [Entity1] Michael Jordan is a Person [Entity2] Berkeley is a Place"
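To be explicit about the format: the declared count and the entity tags can be parsed back out of a target string like this (a quick sketch; the regexes assume exactly the format shown above):

```python
import re

target = "2 Entities [Entity1] Michael Jordan is a Person [Entity2] Berkeley is a Place"

# Leading "<N> Entities" prefix declares how many entities follow
declared = int(re.match(r"(\d+) Entities", target).group(1))

# Each entity is introduced by an "[EntityX]" tag
tags = re.findall(r"\[Entity(\d+)\]", target)
```

Here `declared` is 2 and `tags` is `['1', '2']`, so the declared count and the generated tags agree.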
I use this target format to achieve two things:
1. the model has to recognize how many entities the given input text has
2. it has to generate the respective number of entities it predicted, along with their types.
Although the main overall objective here is to generate the exact target sequence, I want to ensure that my model performs the above two tasks correctly. That is:
Objective 1: It has to predict the correct number of entities in a given text.
Objective 2: It has to generate the number of entities that it predicted the sentence might have.
Objective 3: It has to generate a sequence as close as possible to the expected target sequence.
Objective 3 is taken care of by the CrossEntropyLoss that the T5 model computes internally. But how do I add Objectives 1 and 2 to that loss? I tried the following for Objective 1:
```python
import torch
from torch.nn import CrossEntropyLoss

lm_labels = batch["target_ids"]
lm_labels[lm_labels[:, :] == self.tokenizer.pad_token_id] = -100

t5_outputs = self.model(
    input_ids,
    attention_mask=attention_mask,
    decoder_input_ids=decoder_input_ids,
    decoder_attention_mask=decoder_attention_mask,
    labels=lm_labels,
)
loss = t5_outputs.loss
lm_logits = t5_outputs.logits

pred_target_ids = torch.argmax(lm_logits, dim=2)
# Take the first 3 tokens corresponding to the string "2 Entities"
pred_ent_count_ids = pred_target_ids[:, :3]  # from the predictions
gold_ent_count_ids = lm_labels[:, :3]        # from the gold target sentence

ent_count_ids_loss_fct = CrossEntropyLoss(ignore_index=-100)
ent_count_ids_loss = ent_count_ids_loss_fct(
    pred_ent_count_ids.float(), gold_ent_count_ids.float()
)
loss += ent_count_ids_loss
```
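To sanity-check the shapes, I also tried the same slice on dummy tensors, taking it directly from the logits rather than from the argmaxed ids (my assumption being that CrossEntropyLoss wants raw logits over the vocabulary, and that argmax would break the gradient):

```python
import torch
from torch.nn import CrossEntropyLoss

# Dummy shapes: batch of 2, target length 8, vocabulary of 10
lm_logits = torch.randn(2, 8, 10, requires_grad=True)
lm_labels = torch.randint(0, 10, (2, 8))

# Slice the first 3 positions (the "2 Entities" prefix) from the
# logits themselves, not from an argmax over them
count_logits = lm_logits[:, :3, :]  # (batch, 3, vocab)
count_labels = lm_labels[:, :3]     # (batch, 3)

# CrossEntropyLoss expects (N, C) logits and (N,) class indices
loss_fct = CrossEntropyLoss(ignore_index=-100)
count_loss = loss_fct(
    count_logits.reshape(-1, count_logits.size(-1)),
    count_labels.reshape(-1),
)
count_loss.backward()  # gradient flows back to lm_logits
```

This version runs and backpropagates, but I am not sure it is the right way to express Objective 1.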
1. Is that the correct way to compute a loss for the number of entities?
2. How can we compute a loss when the number of predicted entities is mismatched with the actual generated [EntityX] tags? The model has to be penalized if it says there are "3 Entities" but generates only up to "[Entity2]", or generates more than "[Entity3]", such as "[Entity4]".
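To make the mismatch concrete, this is how I currently measure it after decoding (a hypothetical helper, not part of my training code; the count is computed on the decoded string, so it is not differentiable, which is exactly why I don't know how to fold it into the loss):

```python
import re

def count_mismatch_penalty(decoded: str) -> int:
    """Penalty = |declared entity count - number of [EntityX] tags generated|."""
    m = re.match(r"(\d+) Entities", decoded)
    declared = int(m.group(1)) if m else 0
    generated = len(re.findall(r"\[Entity\s?\d+\]", decoded))
    return abs(declared - generated)
```

For example, a decoded output of "3 Entities [Entity1] Michael Jordan is a Person [Entity2] Berkeley is a Place" declares 3 entities but generates 2 tags, giving a penalty of 1, while a consistent output gives 0.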