Trainer is very slow to start training when group_by_length is set to True

I observe a very long delay between calling Trainer.train and the actual start of training.
It appears to come from the LengthGroupedSampler, which is used when group_by_length is set to True.

Is there a way to use multiple workers to accelerate this process?


Hey, did you manage to solve this?
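
One thing that may help with the slow start described above: as far as I can tell, when no precomputed lengths are available, LengthGroupedSampler measures every example itself in a single process. Below is a minimal sketch of a possible workaround, not a confirmed fix. It assumes the training set is an already tokenized `datasets.Dataset` with an `input_ids` column, and that your transformers version supports the `length_column_name` training argument. The helper names `add_length` and `prepare` are made up for illustration.

```python
def add_length(example):
    # Store the token count so the sampler does not have to recompute it.
    # Assumes each example has an "input_ids" list (hypothetical column name).
    return {"length": len(example["input_ids"])}


def prepare(tokenized_dataset, output_dir, num_proc=4):
    # Imported lazily so add_length can be used without transformers installed.
    from transformers import TrainingArguments

    # Dataset.map can spread the work over several worker processes.
    dataset = tokenized_dataset.map(add_length, num_proc=num_proc)

    args = TrainingArguments(
        output_dir=output_dir,
        group_by_length=True,
        # Column the Trainer looks up for precomputed lengths, if it exists.
        length_column_name="length",
    )
    return dataset, args
```

You would then pass the returned dataset and args to Trainer as usual. Whether this removes the whole delay probably depends on the dataset size and the library version, so it is worth timing it on a small subset first.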
