T5 - model.generate() issue

I have fine-tuned a T5 model on a custom dataset.
If I run
model(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, decoder_input_ids=batch["decoder_input_ids"], labels=batch["labels"], decoder_attention_mask=batch["decoder_attention_mask"]).loss

I get a loss of around 0.003, which I took as an indication that the model has been trained well.

But when I use model.generate(), I get random or garbage output. Where am I going wrong?

Can anyone please help me with this?

It’s resolved now.

I was explicitly passing decoder_input_ids as an argument. I assumed they would be right-shifted during training, but that’s not the case.
For the target sequence to be right-shifted automatically, only labels should be passed as an argument.
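
To illustrate the right shift described above, here is a minimal plain-Python sketch of what happens to labels when decoder_input_ids is not passed. It is not the library's code. It assumes the usual T5 setup where the pad token id and the decoder start token id are both 0, and that -100 marks label positions ignored by the loss. The token ids in the example are made up.

```python
# Illustrative sketch only, not the library implementation.
# Assumptions: pad id and decoder start id are both 0 (the usual T5 setup),
# and -100 marks label positions that the loss ignores.

PAD_TOKEN_ID = 0
DECODER_START_TOKEN_ID = 0
IGNORE_INDEX = -100


def shift_right(labels):
    """Build decoder inputs: start token first, then the labels minus the last one."""
    shifted = [DECODER_START_TOKEN_ID] + list(labels[:-1])
    # -100 is not a valid token id, so replace it with the pad id.
    return [PAD_TOKEN_ID if tok == IGNORE_INDEX else tok for tok in shifted]


# Hypothetical target: three tokens, EOS (id 1), then two ignored padding positions.
labels = [37, 423, 8, 1, -100, -100]
decoder_input_ids = shift_right(labels)
print(decoder_input_ids)  # [0, 37, 423, 8, 1, 0]
```

If the unshifted labels are passed as decoder_input_ids instead, the decoder sees at each position the very token it is asked to predict. That would explain a near-zero training loss together with poor model.generate() output.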

Hi, I’m having the same issue.

I assume the solution here was to:

either remove the decoder_input_ids from the model forward call

model(input_ids=batch['input_ids'], attention_mask=batch['attention_mask'], labels=batch['labels'])

or manually shift_right the decoder_input_ids.

In that case, do we also have to shift_right the decoder_attention_mask if we pass both decoder_input_ids and decoder_attention_mask to the model?