@sgugger I am using the Trainer class but not seeing any major speedup in training on a multi-GPU setup. In nvidia-smi and the W&B dashboard, I can see that both GPUs are being used. I then launched the same training script on a single-GPU machine for comparison; the training commands are identical on both machines.
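For reference, my setup looks roughly like this (a minimal sketch: the model, dataset, and hyperparameters are placeholders, not my exact configuration):

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

dataset = load_dataset("imdb", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=16,  # same value on both machines
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args, train_dataset=dataset)
trainer.train()  # launched identically on the 1-GPU and 2-GPU machines
```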
I do not see any significant speedup in training. The training takes hours and I didn't wait for it to finish, but the tqdm time estimates are pretty much the same on both machines. tqdm should reflect the actual progress properly, right? Any suggestions for further diagnosis?
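In case it helps, here is the kind of sanity check I can run on both machines to confirm what the Trainer actually sees (a minimal sketch; the batch size is just an example value):

```python
import torch
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", per_device_train_batch_size=16)

print(torch.cuda.device_count())  # I expect 2 on the multi-GPU machine, 1 on the other
print(args.n_gpu)                 # number of GPUs the Trainer will use
print(args.train_batch_size)      # effective batch size = per_device size * number of GPUs
```

If the effective batch size doubles on the multi-GPU machine, I'd expect roughly half as many steps per epoch there, even if the per-step time shown by tqdm stays about the same.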