Learning rate for the `Trainer` in a multi-GPU setup

Not necessarily: scaling the learning rate is a heuristic that people commonly recommend, but it's also recommended that you test it yourself at your discretion.

What's really happening is that the number of optimizer steps changes: with more GPUs the effective batch size grows, so you take fewer steps per epoch. If you want the same effective learning-rate behavior going from situation A (single GPU) to situation B (multi-GPU), you can try scaling the learning rate accordingly, e.g. multiplying it by the number of GPUs.
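As a minimal sketch of that linear-scaling heuristic (the base values here are hypothetical, and whether the heuristic helps depends on your model and optimizer, so test it yourself):

```python
def scale_lr(base_lr: float, num_gpus: int) -> float:
    """Linear scaling heuristic: multiply the learning rate tuned on a
    single GPU by the number of processes, because the effective batch
    size (per-device batch * num_gpus) grows by the same factor."""
    return base_lr * num_gpus

# Hypothetical example: LR tuned on 1 GPU, now training on 4 GPUs.
single_gpu_lr = 5e-5
print(scale_lr(single_gpu_lr, 4))  # 4x larger LR for a 4x larger effective batch
```

You would then pass the scaled value as the `learning_rate` in your `TrainingArguments` instead of the single-GPU value.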

However, again: test it yourself first. Sometimes it's not necessary.