How to run single-node, multi-GPU training with HF Trainer?


I want to run HF Trainer scripts in a single-node, multi-GPU setting.
Do I need to launch them with a distributed launcher (torch.distributed, torchx, torchrun, Ray Train, PyTorch Lightning, etc.), or can the HF Trainer alone use multiple GPUs without being launched by a third-party distributed launcher?

See the documentation on running scripts. :slight_smile:

I think the docs are insufficient on this point. See my questions here: Using Transformers with DistributedDataParallel — any examples?
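For anyone landing on this thread: both launch modes work on a single node, and the difference is the parallelism strategy. A minimal sketch, assuming your training script is called `train.py` (the filename is illustrative) and the node has 4 GPUs:

```shell
# Option 1: no launcher. Running the script with plain `python` on a
# multi-GPU machine makes Trainer wrap the model in torch.nn.DataParallel
# across all visible GPUs (single process, GPU 0 coordinates).
python train.py

# Option 2 (generally faster): DistributedDataParallel via torchrun,
# which spawns one process per GPU; Trainer picks up the distributed
# environment automatically.
torchrun --nproc_per_node=4 train.py

# Either way, you can restrict which GPUs are used:
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 train.py
```

So no third-party launcher is strictly required, but `torchrun` (or `accelerate launch`) is the usual choice because DDP avoids the overhead of DataParallel.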