Which data parallelism does Trainer use, DP or DDP?

I tried searching the docs, but I didn't find the answer anywhere.

Thank you

It depends on how you launch your training script: with plain `python` it will use DataParallel (DP), and with `python -m torch.distributed.launch` it will use DistributedDataParallel (DDP).
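The reason this works is that `torch.distributed.launch` spawns one process per GPU and passes a `--local_rank` argument to each, while a plain `python` launch does not, so the trainer can fall back to DP when that argument is absent. A minimal sketch of that decision logic (the function name `choose_parallel_mode` is my own, not a Trainer API):

```python
import argparse

def choose_parallel_mode(argv):
    """Decide DP vs DDP from the command-line arguments, as a sketch
    of the launcher-detection idea (not the actual Trainer code)."""
    # torch.distributed.launch injects --local_rank into each process;
    # a plain `python train.py` invocation leaves it at the default -1.
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=-1)
    args, _ = parser.parse_known_args(argv)
    # -1 means no launcher: one process sees all GPUs -> DataParallel.
    # Otherwise each process owns one GPU -> DistributedDataParallel.
    return "DP" if args.local_rank == -1 else "DDP"

print(choose_parallel_mode([]))                     # plain python launch -> DP
print(choose_parallel_mode(["--local_rank", "0"]))  # distributed launch -> DDP
```

Note that on recent PyTorch versions `torchrun` replaces `python -m torch.distributed.launch` and communicates the rank via the `LOCAL_RANK` environment variable instead, but the same detection principle applies.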
