Multi-GPU Distributed Training using Accelerate on Windows

I am trying to run multi-GPU distributed training on a model using the Accelerate library. I have already set up my configs with `accelerate config` and am launching with `accelerate launch train.py`.
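To rule out anything specific to my model, here is a stripped-down sketch of the kind of script I am launching (the tiny model and random data below are placeholders, not my real code):

```python
# train.py -- stripped-down sketch of my script; the tiny model and
# random data are placeholders for my actual training code
import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(10, 2)            # stand-in for my real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

model.train()
for step in range(100):
    batch = torch.randn(8, 10, device=accelerator.device)
    loss = model(batch).pow(2).mean()     # dummy loss
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```

Every time I launch it, the run crashes with the following errors: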

```
    raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
ERROR:torch.distributed.elastic.multiprocessing.api:failed

    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError
```

I am running this on a Windows system, and I understand that NCCL is not available on Windows. I would appreciate it if anyone could provide a workaround for Windows :smiley:
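From what I have read, one possible workaround is to force the gloo backend (which PyTorch does support on Windows) instead of NCCL. The sketch below is my guess based on the Accelerate docs; `InitProcessGroupKwargs` and its `backend` argument are my reading of the API, not something I have verified on my setup:

```python
# My guess at forcing the gloo backend -- untested on my machine.
from accelerate import Accelerator
from accelerate.utils import InitProcessGroupKwargs

# Ask Accelerate to initialize torch.distributed with gloo (available on
# Windows) instead of the default NCCL backend.
gloo_kwargs = InitProcessGroupKwargs(backend="gloo")
accelerator = Accelerator(kwargs_handlers=[gloo_kwargs])
```

Is something like this the right direction, or is there an option in `accelerate config` that selects the backend?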