Linear Learning Rate Warmup with step-decay

Hey Guys,

I am new to PyTorch and have to train deep models like ResNet and VGG that require the following learning rate schedule:

  1. Linearly increase the learning rate from 0 to `initial_lr` over the first `k` training steps/iterations

  2. Continue with `initial_lr` for the next `m` training steps

  3. Decay the learning rate in a step-decay manner. For example, after the 30th epoch you divide `initial_lr` by 10, and after the 45th epoch you divide it by 10 again for the rest of training.

The schedule can be visualized with the picture below (an example using LeNet-300-100 on MNIST with TensorFlow 2).

How can I achieve this particular learning rate schedule with Hugging Face `transformers`?


Hey @adaptivedecay, you can define your own learning rate scheduler by subclassing `Trainer` and overriding the `create_scheduler` method to include your logic: Trainer — transformers 4.5.0.dev0 documentation

Alternatively, you can pass the optimizer and scheduler as a tuple via the `optimizers` argument.
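To illustrate the second option, here is a minimal sketch of the multiplier function for this warmup → constant → step-decay schedule. The function and parameter names (`make_lr_lambda`, `warmup_steps`, `constant_steps`, `milestones`, `gamma`) are my own illustrative choices, not from the thread:

```python
def make_lr_lambda(warmup_steps, constant_steps, milestones, gamma):
    """Return a function mapping a training step to a multiplier on initial_lr."""
    def lr_lambda(step):
        if step < warmup_steps:
            # 1) linear warmup from 0 up to initial_lr
            return step / max(1, warmup_steps)
        if step < warmup_steps + constant_steps:
            # 2) hold at initial_lr
            return 1.0
        # 3) step decay: multiply by gamma at each milestone step that has passed
        factor = 1.0
        for m in milestones:
            if step >= m:
                factor *= gamma
        return factor
    return lr_lambda
```

You can then wrap this in `torch.optim.lr_scheduler.LambdaLR(optimizer, make_lr_lambda(...))` and hand both to the `Trainer` as `optimizers=(optimizer, scheduler)` (the milestone values would come from your epoch boundaries converted to steps).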

Hi @lewtun, could you share some sample code? That would be helpful.

You can take a look at the implementation in the Trainer here: transformers/ at d9c62047a8d75e18d2849d345ab3394875a712ef · huggingface/transformers · GitHub


Thanks for the link. I will look into it and get back!
