I am using Accelerate to train a model on multiple GTX 1080 GPUs. It takes ~3 sec to process 128 samples (16 per GPU). Even on A100 GPUs I am getting the same speed. But when I use the Trainer class, I get faster processing even on a single GPU. What could be the source of these differences?
Trainer uses Accelerate under the hood when it's available, so the difference is probably just that the Trainer's use of it is more optimized than yours.
It’s also possible that the Trainer makes use of other libraries on top of Accelerate. For example, libraries like FlashAttention can also have a big impact on throughput, as can settings like mixed precision that Trainer can enable for you.