Hi Guys! I’m using Azure Machine Learning Services for training a text classification model. But training the model on 10 GPU’s isnt as fast as I would expect it to be. First, I did a benchmark with 1 V100. It took 3 hours to run 5 epochs. Secondly, I did a run with 10 V100. It took 2 hours to ru…

Azure Machine Learning: Training isn't significant faster with multiple GPU's

Pepio3 February 2, 2023, 12:55pm 2

I have the same issue. The data seems to be distributed across multiple nodes, but the total training time does not decrease.

Topic		Replies	Views
Multiple GPUs do not speed up the training 🤗Accelerate	1	3462	January 26, 2022
Single GPU is faster than multiple GPUs 🤗Accelerate	3	2013	January 31, 2024
Time out for Multi node training on Google Cloud (GCP) 🤗Accelerate	2	896	September 9, 2023
What does "--multi_gpu" do under the hood? (and how to use it) 🤗Accelerate	7	6630	May 31, 2023
Why is the training time differ? 🤗Accelerate	1	322	June 25, 2024