Which data parallelism does the Trainer use: DP or DDP?

Hmm… have you tried launching it via accelerate or torchrun? The Trainer picks the strategy from how the script is launched: under a distributed launcher (one process per GPU) it uses DistributedDataParallel (DDP), while a plain single-process run on a multi-GPU machine falls back to torch.nn.DataParallel (DP).

# single node, 2 GPUs
torchrun --nproc_per_node=2 train.py
# or
accelerate launch --num_processes=2 train.py
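
If you want to confirm which strategy a given launch will pick, here is a minimal sketch (assuming a recent transformers release where TrainingArguments exposes the parallel_mode and n_gpu properties; the output_dir value and the script name check_parallel.py are placeholders):

# check_parallel.py -- print the data-parallel mode the Trainer would use
import torch
from transformers import TrainingArguments
from transformers.training_args import ParallelMode

args = TrainingArguments(output_dir="out")  # placeholder output dir

print(f"visible GPUs: {torch.cuda.device_count()}")
print(f"parallel_mode: {args.parallel_mode}")

if args.parallel_mode == ParallelMode.DISTRIBUTED:
    # launched via torchrun/accelerate: one process per GPU -> DDP
    print("Trainer will wrap the model in DistributedDataParallel")
elif args.parallel_mode == ParallelMode.NOT_DISTRIBUTED and args.n_gpu > 1:
    # plain `python check_parallel.py` with several visible GPUs -> DP
    print("Trainer will fall back to torch.nn.DataParallel")
else:
    print("single device, no data parallelism")

Running it once with plain python and once under torchrun --nproc_per_node=2 should show the mode switch from NOT_DISTRIBUTED to DISTRIBUTED.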

See also: "Accelerator selection" in the Transformers docs.