T5 - model.generate() issue

Rakshith291 · March 13, 2022, 8:28am

I have fine-tuned a T5 model on a custom dataset. If I run

model(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, decoder_input_ids=batch["decoder_input_ids"], labels=batch["labels"], decoder_attention_mask=batch["decoder_attention_mask"]).loss

I get a loss of around 0.003, so I assume the model has trained well.

But when I call model.generate(), the output is random garbage. Where am I going wrong?

Can anyone help me with this?

Rakshith291 · March 14, 2022, 10:05am

It’s resolved now.

I was explicitly passing decoder_input_ids to the model. I assumed they would be shifted right during training, but that’s not the case.
For the target sequence to be shifted right automatically, only labels should be passed, without decoder_input_ids.
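
For anyone hitting the same thing, here is a minimal plain-Python sketch of the right shift. It assumes the shift works the way I understand T5 to do it internally when only labels are passed: prepend the decoder start token, drop the last position, and turn any -100 (ignored label) into the pad id. The helper name shift_right and the token ids below are made up for illustration; the default of 0 for both ids is what T5 checkpoints normally use.

```python
def shift_right(labels, decoder_start_token_id=0, pad_token_id=0):
    """Build decoder inputs from labels by shifting one position to the right.

    Illustrative helper only, not the library function itself.
    """
    # Prepend the start token and drop the last label.
    shifted = [decoder_start_token_id] + list(labels[:-1])
    # -100 marks positions ignored by the loss; it is not a valid token id.
    return [pad_token_id if t == -100 else t for t in shifted]


# Hypothetical target: three real tokens (ending in EOS = 1) plus two ignored positions.
labels = [8774, 296, 1, -100, -100]
decoder_input_ids = shift_right(labels)
print(decoder_input_ids)  # [0, 8774, 296, 1, 0]
```

If the unshifted target is passed as decoder_input_ids, the decoder sees the very token it is asked to predict at each step, which would explain a near-zero training loss alongside unusable generate() output.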

waxef · March 18, 2024, 12:26pm

Hi, I’m having the same issue.

I assume the solution here was to:

either remove the decoder_input_ids from the model forward call

model(input_ids=batch['input_ids'], attention_mask=batch['attention_mask'], labels=batch['labels'])

or manually shift_right the decoder_input_ids.

If we do pass decoder_input_ids and decoder_attention_mask to the model, do we also have to shift_right the decoder_attention_mask?
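
A sketch of what that would look like, assuming the mask only needs to stay aligned with the shifted ids (this is my assumption, not something confirmed in this thread). The helper name shift_mask_right and the example mask are hypothetical.

```python
def shift_mask_right(mask):
    """Shift an attention mask one position right so it lines up with shifted decoder inputs."""
    # Position 0 now holds the decoder start token, which should be attended to.
    return [1] + list(mask[:-1])


# Hypothetical mask for a target with three real tokens and two padding positions.
label_mask = [1, 1, 1, 0, 0]
decoder_attention_mask = shift_mask_right(label_mask)
print(decoder_attention_mask)  # [1, 1, 1, 1, 0]
```

With right-padded targets the mask may not matter much in practice: the decoder is causal, so the positions that count toward the loss never attend to the trailing padding anyway.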
