How to Ensure Each Process Reads Its Own Dataset and Trains Correctly When Using Trainer?

I’m using the Hugging Face Trainer for training, but my dataset is quite large, so I want each process to read only its own shard of the dataset (I know this can be done with split_dataset_by_node or by handling rank and world size manually). However, I noticed that when the Trainer calls accelerate’s prepare(), it wraps my DataLoader, so the data still gets sharded by rank on top of my manual split. How can I resolve this issue?
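
For reference, here is a minimal sketch of the setup I mean (the data files, gpt2 model, and hyperparameters are just placeholders; launched with `torchrun --nproc_per_node=N train.py`):

```python
import os

from datasets import load_dataset
from datasets.distributed import split_dataset_by_node
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# RANK / WORLD_SIZE are set by torchrun.
rank = int(os.environ.get("RANK", 0))
world_size = int(os.environ.get("WORLD_SIZE", 1))

# Stream the large dataset and keep only this process's shard.
dataset = load_dataset("json", data_files="data/*.jsonl", streaming=True)["train"]
dataset = split_dataset_by_node(dataset, rank=rank, world_size=world_size)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

model = AutoModelForCausalLM.from_pretrained("gpt2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=8,
        max_steps=100,  # required for a streaming dataset of unknown length
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

# The Trainer builds the DataLoader internally and passes it through
# accelerator.prepare(), which is where the per-rank wrapping I want to
# avoid seems to happen.
trainer.train()
```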
Thank you in advance for your help!
