I am trying to generate summaries with t5-small, using a maximum target length of 30. My original inputs are German PDF invoices: I run OCR on them and concatenate the recognized words to create the input text. The expected output is the invoice number. However, even after 3 days of training on a V100, the generated summaries are always exactly 200 tokens long (this has been the case since epoch 1 or 2 out of 300) and the results are garbage. The summaries look as if someone had shuffled the original words a little, although they do contain the invoice number somewhere near the start.
What might cause the model to always generate exactly 200 tokens?
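One way to narrow this down might be to check whether the generated sequences actually end with an end-of-sequence token or are simply cut off at a length limit. The sketch below is a hypothetical helper, not part of my training code: the function names and the toy batch are made up for illustration, and it assumes the usual T5 token ids (1 for `</s>`, 0 for padding, which T5 also uses as the decoder start token).

```python
from collections import Counter

EOS_ID = 1  # assumed: T5 end-of-sequence token id (</s>)
PAD_ID = 0  # assumed: T5 padding id (also the decoder start token)


def effective_length(ids):
    """Length up to and including the first EOS; without EOS, length minus trailing padding."""
    if EOS_ID in ids:
        return ids.index(EOS_ID) + 1
    trimmed = list(ids)
    while trimmed and trimmed[-1] == PAD_ID:
        trimmed.pop()
    return len(trimmed)


def summarize_lengths(batch, cap):
    """Return a histogram of effective lengths and the count of sequences that hit `cap` without EOS."""
    lengths = [effective_length(ids) for ids in batch]
    hit_cap = sum(
        1 for ids, n in zip(batch, lengths) if n >= cap and EOS_ID not in ids
    )
    return Counter(lengths), hit_cap


# Toy batch of generated token ids (illustrative values only), with a cap of 6.
toy_batch = [
    [0, 5, 6, 1, 0, 0],     # stops at EOS, then padded
    [0, 7, 8, 9, 10, 11],   # never emits EOS, runs into the cap
]
histogram, truncated = summarize_lengths(toy_batch, cap=6)
print(histogram, truncated)
```

If every real output runs into the cap without an EOS, the 200 would seem to come from whatever `max_length` is in effect at generation time rather than from the target length of 30 used for training, so that setting may be worth checking first.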