How to ensure the dataset is shuffled for each epoch using Trainer and Datasets?

No, this would be very bad practice, so we don’t offer that option.

That would be the group_by_length argument.
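For reference, here is a minimal sketch of passing that argument through TrainingArguments; the model checkpoint, dataset, and output directory below are only illustrative choices, not anything prescribed in this thread:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Illustrative model/dataset; swap in your own.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

raw = load_dataset("imdb", split="train[:1%]")  # tiny slice just for the example
train_dataset = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="out",
    group_by_length=True,  # batch samples of similar length to reduce padding
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # lets the Trainer pad each batch dynamically
)
trainer.train()
```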

Yes, training will resume with the same shuffle, at the same point you were at when the checkpoint was saved.
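A short sketch of what resuming looks like, assuming a trainer configured as in the example above; the checkpoint path is a placeholder:

```python
# Resume from the most recent checkpoint found in output_dir;
# the shuffled ordering and training step are restored from the checkpoint.
trainer.train(resume_from_checkpoint=True)

# Or point at a specific checkpoint directory (placeholder path).
trainer.train(resume_from_checkpoint="out/checkpoint-500")
```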
