T5 generates output that is almost the same as the input!

I have a dataset containing 162 rows. A sample looks like this: INPUT: ‘T4aN1M0 Stage IIIA disease’, OUTPUT: ‘STAGE:3a|T_prefix:|T:4a|N:1|M:0’. I followed the summarization notebook from the examples. I have tried training the model for different numbers of epochs (3, 10, 15, 30), but the model cannot capture the relationship between input and output. This is what the model generates: INPUT: ‘cT1bN3M1c stage IVB’, OUTPUT: ‘stage IVB stage stage cT1bN3M1c stage’. I would appreciate any help. Thanks in advance.
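To make the target format concrete, here is a minimal sketch of how a string like the OUTPUT above could be split into fields. This is useful for checking generated text field by field instead of by eye. The helper name `parse_target` is hypothetical and not part of the notebook; it only assumes the `key:value|key:value` layout shown in the sample.

```python
def parse_target(target: str) -> dict[str, str]:
    """Split a 'KEY:value|KEY:value' string into a dict (hypothetical helper)."""
    fields = {}
    for part in target.split("|"):
        # partition keeps an empty value when nothing follows the colon,
        # e.g. 'T_prefix:' -> ('T_prefix', ':', '')
        key, _, value = part.partition(":")
        fields[key] = value
    return fields


expected = parse_target("STAGE:3a|T_prefix:|T:4a|N:1|M:0")
print(expected)
# {'STAGE': '3a', 'T_prefix': '', 'T': '4a', 'N': '1', 'M': '0'}
```

A generated string that does not follow the layout, such as the ‘stage IVB stage stage cT1bN3M1c stage’ output above, comes back as a single key with an empty value, so it is easy to flag.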


An interesting problem. Could you provide more details? For example:

  1. What is the size of the training dataset?
  2. How many training steps did you run?
  3. How does the training loss change during training?
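As a rough guide for the question about steps, the sketch below estimates the number of optimizer steps for the epoch counts mentioned in the question. It assumes all 162 rows are used for training and a batch size of 8. The batch size is a guess, since the post does not state it, and the function name is made up for illustration.

```python
import math


def total_training_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Estimate optimizer steps, assuming one step per batch and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# 162 rows comes from the question; batch_size=8 is an assumed value.
for epochs in (3, 10, 15, 30):
    print(epochs, total_training_steps(162, batch_size=8, epochs=epochs))
# 3 63
# 10 210
# 15 315
# 30 630
```

Under these assumptions, even 30 epochs is only a few hundred updates, so the loss curve over those steps would say a lot about whether the model is learning at all.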