Multi-GPU training

Both are supported by the Hugging Face Trainer. To use DistributedDataParallel, you just have to start your script with the PyTorch launcher, which runs one process per GPU; see an example here.
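To make the launcher step more concrete, here is a minimal sketch of what each process sees when started with `torchrun`. The script name `train.py`, the GPU count of 4, and the helpers `read_launcher_env` and `is_distributed` are placeholders made up for illustration, not part of the Trainer API. The only thing relied on is that `torchrun` exports `RANK`, `LOCAL_RANK` and `WORLD_SIZE` to every process it starts.

```python
# Launch with the PyTorch launcher, one process per GPU, for example:
#   torchrun --nproc_per_node=4 train.py
# (train.py and the GPU count of 4 are placeholders.)
import os


def read_launcher_env(env=None):
    """Return the rank info that the launcher exports to each process.

    Hypothetical helper: falls back to single-process values when the
    script is run directly with `python train.py`.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),              # global process index
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index on this machine
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total process count
    }


def is_distributed(info):
    """True when more than one process takes part in training."""
    return info["world_size"] > 1


if __name__ == "__main__":
    info = read_launcher_env()
    print(info, "distributed:", is_distributed(info))
```

In a normal training script you should not need this helper, since the Trainer reads the launcher's settings itself; it is only meant to show what changes between a plain `python train.py` run and a launcher run.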