I’m using the official code to fine-tune T5-base on my dataset (seq2seq).
After 3 epochs, the training loss goes to zero, while the eval loss is only close to zero. My model does not perform well on the test set, so what can I do to avoid this zero loss during training? Can I change the loss function? Thanks.