Wav2vec fine-tuning with multi-GPU

Related issue: Excessive GPU-GPU communication with GPT2 making multi-GPU training slow? · Issue #9371 · huggingface/transformers · GitHub