Can accelerator handle the distributed sampler?

As far as I know, in PyTorch a RandomSampler cannot be used directly in distributed data parallel training, since a DistributedSampler is required instead (this link discusses the problem). I am wondering whether accelerator.prepare(dataloader) handles the data split across multiple GPUs if I use a RandomSampler, so that the sub-datasets on each device are mutually exclusive.

You don’t have to worry about using a distributed sampler with Accelerate. Whatever sampler you use, Accelerate will automatically shard it across all processes.
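To make the idea concrete, here is a minimal, self-contained sketch of the kind of sharding this implies. It is not Accelerate's actual implementation (which lives in its DataLoader wrappers); the `shard_indices` helper, the round-robin split, and the shared seed are all assumptions made for illustration, using only the standard library:

```python
import random

def shard_indices(dataset_len, num_processes, process_index, seed=42):
    """Hypothetical helper: shuffle indices with a seed shared by all
    processes (as a seeded RandomSampler would), then keep only this
    process's slice, so the shards are disjoint."""
    rng = random.Random(seed)  # same seed everywhere -> same permutation
    indices = list(range(dataset_len))
    rng.shuffle(indices)
    # Round-robin split: process p takes positions p, p+n, p+2n, ...
    return indices[process_index::num_processes]

# Simulate two processes sharding a 10-sample dataset.
shards = [shard_indices(10, num_processes=2, process_index=p) for p in range(2)]

# The shards are mutually exclusive and together cover the whole dataset.
assert set(shards[0]).isdisjoint(shards[1])
assert set(shards[0]) | set(shards[1]) == set(range(10))
```

In practice you simply pass your DataLoader (with any sampler) through `accelerator.prepare(...)` and each process sees its own disjoint portion of the data.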

That’s great! Thanks!