Hey there,
What’s the recommended way to model longer sequences in NLP using BERT/BART-style models?
Say on the order of ~10k+ tokens.
Tasks are potentially autoregressive and won’t have a fixed-size output vector.
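For context, here is a rough sketch of the length mismatch I mean (assuming the `transformers` library and the public `facebook/bart-base` checkpoint, whose encoder is capped at 1024 positions; the filler text is just a stand-in for a real document):

```python
from transformers import AutoTokenizer

# Hypothetical example: a standard BART tokenizer/checkpoint.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

# Stand-in for a real long document on the order of ~10k+ tokens.
long_document = "some long passage of text " * 2000

token_ids = tokenizer(long_document, truncation=False)["input_ids"]
print(len(token_ids))              # ~10k+ tokens
print(tokenizer.model_max_length)  # 1024 for bart-base, far shorter than the input
```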
Thanks!