What algorithm does Trainer use for multi-GPU training (without torchrun)?

It uses DP (torch.nn.DataParallel) if you launch the script with plain python, and DDP (DistributedDataParallel) if you launch it with torchrun. Any FSDP settings are ignored if you don't launch it with torchrun.
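To make the difference concrete, here is a rough sketch of the decision, not Trainer's actual code. It relies on the fact that torchrun exports LOCAL_RANK (along with RANK and WORLD_SIZE) to every worker it starts, while a plain `python train.py` run does not. The function name `guess_parallel_mode`, its return strings, and the script name `train.py` are made up for illustration.

```python
import os


def guess_parallel_mode(env, n_gpus):
    """Hypothetical helper: approximate which mode a launch would end up in."""
    # torchrun sets LOCAL_RANK for each worker process; plain `python` does not.
    if int(env.get("LOCAL_RANK", -1)) != -1:
        return "DDP"
    # One process that sees several GPUs falls back to DataParallel.
    if n_gpus > 1:
        return "DP"
    return "single device"


# Check the current process, assuming (for illustration) that 2 GPUs are visible.
print(guess_parallel_mode(dict(os.environ), n_gpus=2))
```

So `python train.py` on a 4-GPU machine would land in the DP branch, while something like `torchrun --nproc_per_node=4 train.py` starts four processes that each land in the DDP branch.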
