Dear HF forum,
I am planning to fine-tune Flan-T5.
However, for my task I need a longer sequence length (2048 tokens).
The model currently has a maximum sequence length of 512 tokens.
From related posts on the topic, I understand that T5 uses relative position embeddings, so it can handle longer sequence lengths in principle.
However, will fine-tuning it at a longer sequence length than it was trained on result in a subpar model?
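For context, here is a minimal sketch of how I was planning to handle the longer inputs. This assumes the Hugging Face transformers AutoTokenizer / AutoModelForSeq2SeqLM APIs and the google/flan-t5-base checkpoint (the checkpoint choice is just for illustration); as far as I can tell, the 512 limit comes from the tokenizer's default model_max_length rather than a fixed position embedding table:

```python
# Minimal sketch (my assumption of how to lift the 512-token default) for
# tokenizing long inputs before fine-tuning Flan-T5 at 2048 tokens.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # checkpoint name chosen only as an example

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# T5 uses bucketed relative position biases, so there is no absolute position
# table to resize; the 512 limit here is only the tokenizer's default.
tokenizer.model_max_length = 2048

long_text = "..."  # placeholder for one of my long training examples
inputs = tokenizer(
    long_text,
    max_length=2048,
    truncation=True,
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Is overriding the limit like this the right approach before fine-tuning at 2048 tokens, or is there more to it?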
Thank you
Anuj