As far as I know, in PyTorch a RandomSampler cannot be used directly in distributed data parallel training, since a DistributedSampler is required instead (this link discusses the problem). I am wondering whether accelerator.prepare(dataloader) handles the data split across multiple GPUs if I keep the RandomSampler, so that the subsets seen by each device are mutually exclusive.
You don’t have to worry about using a distributed sampler with Accelerate. Whatever your sampler is, Accelerate will automatically shard it for all processes.
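For concreteness, here is a minimal sketch of what that looks like in user code (the toy TensorDataset, batch size, and variable names are just placeholders for illustration):

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset
from accelerate import Accelerator

# Toy dataset purely for illustration.
dataset = TensorDataset(torch.arange(1000).float().unsqueeze(1))

# Build the dataloader with a plain RandomSampler, exactly as in single-GPU code.
sampler = RandomSampler(dataset)
dataloader = DataLoader(dataset, sampler=sampler, batch_size=8)

accelerator = Accelerator()

# prepare() wraps the dataloader so that each process only iterates over its own
# shard of each (shuffled) epoch; no DistributedSampler needs to be created by hand.
dataloader = accelerator.prepare(dataloader)

for batch in dataloader:
    # Each batch is already placed on the correct device for this process.
    pass
```

When launched with `accelerate launch`, each process runs this same script and receives a non-overlapping portion of the shuffled data per epoch.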
That’s great! Thanks!