I observe a very long delay before training actually starts once `Trainer.train` is called. It appears to come from `LengthGroupedSampler`, which is used when `group_by_length` is set to `True`.
Is there a way to use multiple workers to accelerate this process?
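One workaround, assuming the delay comes from the sampler computing per-example lengths in the main process: precompute the lengths yourself with multiple workers and store them in the dataset under the column named by `TrainingArguments.length_column_name` (default `"length"`); when that column exists, the Trainer reads it directly instead of having `LengthGroupedSampler` recompute lengths. A minimal sketch using stdlib threads on a toy stand-in for tokenized data (with `datasets`, `Dataset.map(..., num_proc=N)` achieves the same with processes):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a tokenized dataset; in practice these are your input_ids.
input_ids = [[1, 2, 3], [1], [1, 2]]

# Precompute per-example lengths with multiple workers up front, so the
# sampler does not have to iterate over the whole dataset at train start.
with ThreadPoolExecutor(max_workers=4) as pool:
    lengths = list(pool.map(len, input_ids))

print(lengths)  # [3, 1, 2]
```

The resulting list can then be attached as a `"length"` column (e.g. via `Dataset.add_column`) before calling `Trainer.train`.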