Hi All,
I fine-tuned an mT5-base checkpoint on a natural-language-to-Bash translation task and noticed that the model generates the same first token for every input. mBART fine-tuned on the same data doesn't have this problem and produces the expected results. Any idea what could be going wrong? Is there a known issue with the mT5 checkpoints? I earlier had to disable mixed precision (fp16) for training to work properly.
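
For reference, here's a minimal sketch of how I'm generating (the checkpoint path `./mt5-nl2bash` and the example prompt are just placeholders):

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

# Placeholder path to the fine-tuned checkpoint.
model_dir = "./mt5-nl2bash"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = MT5ForConditionalGeneration.from_pretrained(model_dir)

inputs = tokenizer(
    "list all files modified in the last 24 hours",  # example prompt
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)

# Decode WITHOUT skipping special tokens so the leading token is visible.
# T5-family models start decoding from <pad> (decoder_start_token_id ==
# pad_token_id), so it's worth checking whether the repeated "first token"
# is just that, or an actual content token.
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```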