I have fine-tuned a T5 model on a custom dataset. If I do
model(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, decoder_input_ids=batch["decoder_input_ids"], labels=batch["labels"], decoder_attention_mask=batch["decoder_attention_mask"]).loss
I'm getting a loss of around 0.003, so I assumed that was an indication the model had trained well.
But if I use model.generate(), I get random or garbage output, and I was wondering where I'm going wrong.
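For reference, this is roughly how I'm calling generate (the max_length and num_beams values here are just placeholders, not my exact settings):

```python
# Sketch of my generation call; generation settings are placeholder values.
generated_ids = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=64,   # placeholder
    num_beams=4,     # placeholder
)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
```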
I specifically passed decoder_input_ids as an argument, assuming they would be right-shifted during training, but that's not the case.
In order to have the target sequence right-shifted, only labels should be provided as an argument; the model then builds decoder_input_ids from them internally.
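As far as I understand, the "labels only" call would look like this, with the shift happening inside the model (this is my reading of the docs, not something I've verified against the source):

```python
# Only labels are passed; the model is supposed to create decoder_input_ids
# internally by right-shifting the labels (replacing any -100 padding in the
# shifted copy with the pad token id).
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=batch["labels"],
)
print(outputs.loss)
```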
Do we also have to right-shift the decoder_attention_mask in this case, if we were to pass decoder_input_ids and decoder_attention_mask to the model?
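To make the question concrete, this is the kind of explicit construction I have in mind. I believe prepare_decoder_input_ids_from_labels is the helper the seq2seq data collator uses for the shift, but the mask handling (especially the start-token position) is exactly what I'm unsure about:

```python
# My current understanding, possibly wrong: decoder_input_ids should be the
# *right-shifted* labels, and decoder_attention_mask should describe that
# shifted sequence rather than the original labels.
decoder_input_ids = model.prepare_decoder_input_ids_from_labels(labels=batch["labels"])

# Build a mask over the shifted sequence. For T5 the shifted-in start token
# is the pad id, so it has to be re-enabled by hand (is this correct?).
decoder_attention_mask = decoder_input_ids.ne(model.config.pad_token_id).long()
decoder_attention_mask[:, 0] = 1  # keep the decoder start token visible

outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    decoder_input_ids=decoder_input_ids,
    decoder_attention_mask=decoder_attention_mask,
    labels=batch["labels"],
)
```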