Does the Hugging Face Seq2SeqTrainer's use of Accelerate mean it cannot be run with DDP?
As far as I understand, Accelerate is not compatible with DDP here: under DDP each process reports n_gpu=1 (even when I use multiple GPUs), but Accelerate expects to see multiple devices…
I got the following error message:
ValueError: DistributedDataParallel device_ids and output_device arguments only work with single-device/multiple-device GPU modules or CPU modules, but got device_ids [0], output_device 0, and module parameters {device(type='cuda', index=0), device(type='cuda', index=1), device(type='cuda', index=2), device(type='cuda', index=3)}.
How can I avoid using Accelerate in the Hugging Face Seq2SeqTrainer? I must use DDP.