How to train transformer (seq-to-seq) for very large seq?

seyeeet · October 4, 2021, 1:38pm

I have a seq-to-seq task, but my input sequence is very long (10k tokens). The output sequence is of normal length, though (fewer than 512 tokens).
I notice that a standard transformer does not work very well in my case.

I was doing it in an autoregressive style, if that helps.

I was wondering if folks here have any suggestions on how to do this?

Thanks
