Using Transformers with DistributedDataParallel — any examples?

Then you just need to launch your training script properly; see here.
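In case a concrete starting point helps, here is a minimal sketch of a script that wraps a Transformers model in DistributedDataParallel and is meant to be started with `torchrun`. It is not taken from the linked page. The checkpoint name, the random batch, the helper `read_dist_env`, and the file name `train.py` are all placeholders I made up for illustration. It relies on the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables that `torchrun` sets for each process.

```python
import os


def read_dist_env(env=os.environ):
    """Return (rank, local_rank, world_size) from the variables torchrun exports.

    Hypothetical helper; falls back to single-process values when they are unset.
    """
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


def main():
    # Heavy imports live here so the helper above can be used without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from transformers import AutoModelForSequenceClassification

    rank, local_rank, world_size = read_dist_env()
    use_cuda = torch.cuda.is_available()

    # The default env:// init reads MASTER_ADDR/MASTER_PORT, which torchrun also sets.
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")

    # Placeholder checkpoint; swap in whatever model you are training.
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    if use_cuda:
        torch.cuda.set_device(local_rank)
        device = torch.device("cuda", local_rank)
        model = DDP(model.to(device), device_ids=[local_rank])
    else:
        device = torch.device("cpu")
        model = DDP(model)

    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    # Dummy batch standing in for a real DataLoader (normally built with a
    # DistributedSampler so each process sees a different shard).
    input_ids = torch.randint(0, 1000, (4, 16), device=device)
    labels = torch.randint(0, 2, (4,), device=device)

    loss = model(input_ids=input_ids, labels=labels).loss
    loss.backward()  # DDP averages gradients across processes during backward
    optimizer.step()

    if rank == 0:
        print(f"world_size={world_size} loss={loss.item():.4f}")

    dist.destroy_process_group()


# Only start training when launched by torchrun (which sets RANK).
if __name__ == "__main__" and "RANK" in os.environ:
    main()
```

Assuming the file is saved as `train.py`, something like `torchrun --nproc_per_node=2 train.py` should start two processes on one machine. Adjust the process count to the number of GPUs you have.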