Summarization with mT5


I am trying to do summarization with mT5. When I use the official summarization Colab, which uses `Seq2SeqTrainer`, the model outputs garbage. You can see my GitHub issue here. Do you have any ideas on how to proceed with summarization using mT5?


Disabling fp16 seems to solve the issue, but I would rather keep fp16, since it roughly doubles the training speed 🙂
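For anyone hitting the same problem: a minimal sketch of the relevant training arguments, assuming the standard `transformers` `Seq2SeqTrainingArguments` API. mT5 (like T5 v1.1) is known to produce NaNs/garbage under fp16, so the workaround is `fp16=False`; if your GPU supports bfloat16 (Ampere or newer), `bf16=True` can recover most of the mixed-precision speedup without the overflow issue. All values here (output dir, batch size, etc.) are placeholders.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-summarization",   # placeholder path
    per_device_train_batch_size=8,    # placeholder value
    predict_with_generate=True,
    fp16=False,   # mT5 is unstable in fp16 and tends to output garbage
    bf16=True,    # optional: bfloat16 mixed precision, if the GPU supports it
)
```

With these arguments passed to `Seq2SeqTrainer`, training runs in bf16 (or full fp32 if you also set `bf16=False`), trading some speed for numerically stable mT5 outputs.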