Hi All,
I fine-tuned an mT5-base checkpoint on a natural-language-to-Bash translation task and noticed that the model generates the same first token for every input. mBART fine-tuned on the same data doesn't have this problem and produces the expected results. Any idea what could be going wrong? Is there a known issue with the mT5 checkpoints? I earlier had to disable mixed precision (fp16) for training to work properly.
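
For reference, here's a minimal sketch of how I'm generating (the checkpoint path `./mt5-nl2bash` and the example prompt are just placeholders):

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

# Placeholder path to the fine-tuned checkpoint.
model_dir = "./mt5-nl2bash"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = MT5ForConditionalGeneration.from_pretrained(model_dir)

inputs = tokenizer(
    "list all files modified in the last 24 hours",  # example prompt
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)

# Decode WITHOUT skipping special tokens so the leading token is visible.
# T5-family models start decoding from <pad> (decoder_start_token_id ==
# pad_token_id), so it's worth checking whether the repeated "first token"
# is just that, or an actual content token.
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```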