Linear Learning Rate Warmup with step-decay

adaptivedecay · April 19, 2021, 5:00am

Hey Guys,

I am new to PyTorch and have to train deep models such as ResNet and VGG, which require the following learning rate schedule:

Linearly increase the learning rate from 0 to ‘initial_lr’ over the first ‘k’ training steps (iterations).
Continue with ‘initial_lr’ for the next ‘m’ training steps.
Decay the learning rate in a step-decay manner. For example, after the 30th epoch, divide ‘initial_lr’ by 10, and after the 45th epoch, divide it by 10 again for the rest of training.
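
The three phases above can be written as a single function of the training step. This is only a minimal sketch in plain Python: the name `lr_at_step`, the value of `steps_per_epoch`, and the concrete numbers are assumptions for illustration, and the epoch milestones (30 and 45) are converted to steps by hand.

```python
def lr_at_step(step, initial_lr, warmup_steps, milestones, gamma=0.1):
    """Learning rate at a given training step.

    warmup_steps: number of steps of linear warmup from 0 to initial_lr (the 'k' above)
    milestones:   steps at which the rate is multiplied by gamma
    """
    if step < warmup_steps:
        # Phase 1: linear ramp from 0 up to initial_lr
        return initial_lr * step / warmup_steps
    # Phases 2 and 3: constant rate, scaled by gamma once per milestone passed
    drops = sum(1 for m in milestones if step >= m)
    return initial_lr * gamma ** drops


steps_per_epoch = 100  # hypothetical; depends on dataset size and batch size
milestones = [30 * steps_per_epoch, 45 * steps_per_epoch]

for step in (0, 250, 500, 2999, 3000, 4500):
    print(step, lr_at_step(step, 0.1, 500, milestones))
```

Here the constant phase (‘m’ steps) is implicit: it is whatever lies between the end of warmup and the first milestone.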

This can be better visualized using the following picture:

[Image: learning rate schedule]

This is an example using LeNet-300-100 on MNIST with TensorFlow 2.

How can I achieve this particular learning rate schedule with ‘huggingface’?

Thanks

lewtun · April 19, 2021, 7:42am

Hey @adaptivedecay, you can define your own learning rate scheduler by subclassing Trainer and overriding its create_scheduler method to include your logic (see the Trainer page of the transformers 4.5.0.dev0 documentation).

Alternatively, you can pass the optimizer and scheduler as a tuple in the optimizers argument of Trainer.
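
For the tuple route, one possible sketch uses PyTorch's `LambdaLR`, which multiplies the optimizer's base learning rate by whatever a user-supplied function returns for the current step. The helper name `warmup_step_decay`, the toy model, and all numbers below are assumptions for illustration, not something taken from the linked documentation. Since Trainer steps the scheduler once per optimization step, milestones given in epochs would have to be converted to steps first.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR


def warmup_step_decay(optimizer, warmup_steps, milestones, gamma=0.1):
    """Hypothetical helper: linear warmup, constant phase, then step decay."""

    def multiplier(step):
        if step < warmup_steps:
            # Linear warmup from 0 to the optimizer's base lr
            return step / max(1, warmup_steps)
        # Multiply by gamma once for every milestone already reached
        passed = sum(1 for m in milestones if step >= m)
        return gamma ** passed

    return LambdaLR(optimizer, lr_lambda=multiplier)


# Toy model and small numbers so the schedule is easy to inspect
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = warmup_step_decay(optimizer, warmup_steps=5, milestones=[20, 30])

lrs = []
for _ in range(40):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()   # no gradients here; only the schedule is being exercised
    scheduler.step()
print(lrs)

# With a real model and dataset, the pair would be handed to Trainer like this:
# trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset,
#                   optimizers=(optimizer, scheduler))
```

The same `multiplier` logic could also live inside an overridden create_scheduler, but that method's exact signature depends on the installed transformers version, so check it before subclassing.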

adaptivedecay · April 19, 2021, 7:59am

Hi @lewtun, do you have any sample code? That would be helpful.

lewtun · April 19, 2021, 8:21am

adaptivedecay · April 21, 2021, 1:22pm

Thanks for the link. I will look into it and get back to you!
