Using Transformers with DistributedDataParallel — any examples?

In case that wasn’t clear, the launcher command below does everything automatically; the Trainer detects the distributed environment and wraps the model in DistributedDataParallel for you:

python -m torch.distributed.launch --nproc_per_node 2 ~/src/main_debug.py
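
(Note: on recent PyTorch versions `torch.distributed.launch` is deprecated in favor of `torchrun`, so `torchrun --nproc_per_node 2 ~/src/main_debug.py` is the equivalent invocation.)

For reference, here is a minimal sketch of what a script like `main_debug.py` could contain. This is an assumption about the script, not the original: the model (`bert-base-cased`) and the tiny in-memory dataset are placeholders, chosen only to show that the script itself needs no DDP-specific code.

```python
# Minimal sketch: placeholder model/data; no DDP-specific code needed.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

def main():
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-cased", num_labels=2
    )

    # Tiny dummy dataset, just to make the example self-contained.
    raw = Dataset.from_dict(
        {"text": ["good", "bad"] * 32, "label": [1, 0] * 32}
    )

    def tokenize(batch):
        return tokenizer(
            batch["text"], truncation=True, padding="max_length", max_length=32
        )

    ds = raw.map(tokenize, batched=True)

    # No DDP flags here: the Trainer picks up LOCAL_RANK/WORLD_SIZE set by
    # the launcher and wraps the model in DistributedDataParallel itself.
    args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=8,
        num_train_epochs=1,
    )
    Trainer(model=model, args=args, train_dataset=ds).train()

if __name__ == "__main__":
    main()
```

Launched with `--nproc_per_node 2`, you should see two processes start, one per GPU, with gradient synchronization handled for you.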

See this thread for a full walkthrough: How to run an end-to-end example of distributed data parallel with Hugging Face's Trainer API (ideally on a single node with multiple GPUs)? - #4 by brando
