The Seq2SeqTrainer (like the standard Trainer) uses a PyTorch Sampler to shuffle the dataset. It reshuffles the dataset at each epoch and also groups samples of roughly the same length together. You can find the Sampler definition here.
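To make the idea concrete, here is a minimal sketch of length-grouped shuffling in plain Python. It is not the library's actual Sampler, only an illustration of the general technique: shuffle all indices, cut them into larger chunks, and sort each chunk by length so that neighbouring samples have similar sizes. The function name `length_grouped_indices` and the `mega_batch_mult` parameter are hypothetical names chosen for this example. As far as I know, in the Trainer this grouping behaviour is switched on with the `group_by_length` training argument.

```python
import random


def length_grouped_indices(lengths, batch_size, mega_batch_mult=4, seed=None):
    """Return a shuffled index order in which nearby indices have similar lengths.

    Hypothetical helper for illustration only, not the Trainer's own code.
    """
    rng = random.Random(seed)
    indices = list(range(len(lengths)))
    rng.shuffle(indices)  # new random order; use a different seed per epoch

    mega_size = batch_size * mega_batch_mult
    ordered = []
    for start in range(0, len(indices), mega_size):
        chunk = indices[start:start + mega_size]
        # Sort only inside the chunk, so the order across chunks stays random.
        chunk.sort(key=lambda i: lengths[i], reverse=True)
        ordered.extend(chunk)
    return ordered


# Toy dataset: token counts of eight samples (made-up values).
lengths = [5, 40, 7, 38, 6, 41, 8, 39]
epoch_0 = length_grouped_indices(lengths, batch_size=2, mega_batch_mult=2, seed=0)
epoch_1 = length_grouped_indices(lengths, batch_size=2, mega_batch_mult=2, seed=1)
print(epoch_0)
print(epoch_1)
```

Sorting inside chunks rather than over the whole dataset is the trade-off here: batches need less padding because their samples have similar lengths, while the per-epoch shuffle still changes which samples end up together.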