Fine-tuning Pegasus on XSum in Colab, but the generated summaries do not change

Hi. I tried to fine-tune Pegasus large on the XSum dataset using Colab (Pro). I was able to finish the fine-tuning with batch size 1 and 2000 epochs in about 40 minutes (a larger batch size crashed Colab). The working Colab notebook I used is shared at

However, the generated summary seems to be identical for the Pegasus large model (google/pegasus-large) and my fine-tuned model, whereas the summary generated by the Pegasus XSum model (google/pegasus-xsum) is different and much better.

The training loss is already 0, and I am not sure what I have done wrong. Any help or pointers would be highly appreciated.
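
One sanity check that might help narrow this down is to see whether the fine-tuned weights differ from the base weights at all, since identical summaries could mean the checkpoint being loaded for generation is effectively the base model. Below is a minimal sketch, not taken from the notebook, assuming PyTorch state dicts. The helper name `changed_parameters` is hypothetical.

```python
import torch


def changed_parameters(base_state, tuned_state, atol=0.0):
    """Return the names of tensors whose values differ between two state dicts.

    Hypothetical helper for illustration: only tensors present in both
    dicts are compared; a shape mismatch counts as a difference.
    """
    changed = []
    for name, base_tensor in base_state.items():
        tuned_tensor = tuned_state.get(name)
        if tuned_tensor is None:
            # Tensor missing from the fine-tuned checkpoint; skip it.
            continue
        same_shape = base_tensor.shape == tuned_tensor.shape
        if not same_shape or not torch.allclose(
            base_tensor.float(), tuned_tensor.float(), rtol=0.0, atol=atol
        ):
            changed.append(name)
    return changed
```

With Transformers, the two state dicts could be obtained with `PegasusForConditionalGeneration.from_pretrained("google/pegasus-large").state_dict()` and the same call on the fine-tuned checkpoint directory (the directory depends on how the model was saved). If the returned list is empty, the weights did not change or the fine-tuned weights were not the ones saved or loaded. If many tensors differ, the problem more likely lies elsewhere, for example in how the inputs and labels are prepared.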