DistributedSampler with Accelerate

I have noticed that when using torch.utils.data.distributed.DistributedSampler as the sampler for my DataLoader, Accelerate changes the dataloader's length after calling accelerator.prepare().

Here is how I'm setting up the dataloader:

import torch

# build data loader: DistributedSampler shards the dataset across processes
sampler = torch.utils.data.distributed.DistributedSampler(
    train_dataset,
    shuffle=True,
)

self.train_loader = torch.utils.data.DataLoader(
    train_dataset,
    shuffle=(sampler is None),  # shuffling is delegated to the sampler
    sampler=sampler,
    batch_size=10,
    drop_last=True,
    num_workers=8,
    pin_memory=True,
    persistent_workers=True,
    prefetch_factor=2,
)


...

# prepare it with accelerate
self.model, self.optimizer, self.train_loader, self.val_loader = self.accelerator.prepare(
    self.model, 
    self.optimizer, 
    self.train_loader, 
    self.val_loader,
)

Here are the dataloader lengths I'm getting:

Before calling accelerator.prepare:

  • len(self.train_loader) = 12 (as expected)

After calling accelerator.prepare:

  • len(self.train_loader) = 1
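
For reference, a minimal standalone sketch that reproduces the same length check (the dataset size, script name, and number of processes here are placeholders, not my actual setup):

import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Launched with e.g. `accelerate launch --num_processes 2 repro.py` so that
# torch.distributed is initialized before the DistributedSampler is built.
accelerator = Accelerator()

train_dataset = TensorDataset(torch.randn(240, 4))  # placeholder dataset
sampler = DistributedSampler(train_dataset, shuffle=True)

train_loader = DataLoader(
    train_dataset,
    sampler=sampler,
    batch_size=10,
    drop_last=True,
)

# accelerator.print only prints on the main process, avoiding duplicate output
accelerator.print(f"before prepare: len(train_loader) = {len(train_loader)}")
train_loader = accelerator.prepare(train_loader)
accelerator.print(f"after prepare:  len(train_loader) = {len(train_loader)}")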

Can someone suggest what the issue could be?

The Accelerate version being used is 1.5.2.

