Does the Hugging Face Seq2SeqTrainer's use of Accelerate mean it cannot be run with DDP?
As far as I understand, Accelerate is not compatible with DDP here: under DDP each process reports n_gpu=1 (even when I use multiple GPUs), but Accelerate expects to see multiple devices…
I got the following error message:
ValueError: DistributedDataParallel device_ids and output_device arguments only work with single-device/multiple-device GPU modules or CPU modules, but got device_ids [0], output_device 0, and module parameters {device(type='cuda', index=0), device(type='cuda', index=1), device(type='cuda', index=2), device(type='cuda', index=3)}.
How can I avoid using Accelerate in the Hugging Face Seq2SeqTrainer? I must use DDP.