T5 for conditional generation: getting started

Hey all, I have been trying to fine-tune T5 on XSum, and my validation loss is constant: it doesn't change at all. The training loss varies a bit but doesn't converge; it stays in the range [10.0, 12.0]. I tried several approaches, such as writing my own nn.Module that is compatible with Trainer(), but none of them worked.
Link to colab (the first version, where I used the default Trainer()).

Can anyone share a working colab link or wandb project that I can use as a reference?

Thanks!