Linear Learning Rate Warmup with step-decay

Hey @adaptivedecay, you can define your own learning rate scheduler by subclassing `Trainer` and overriding the `create_scheduler` method to include your logic (see the Trainer section of the transformers 4.5.0.dev0 documentation).
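
Here is a minimal sketch of that approach, assuming the `create_scheduler(self, num_training_steps, optimizer=None)` signature (older versions only pass `num_training_steps`). The `decay_every` and `decay_rate` values are hypothetical placeholders for your own step-decay policy:

```python
import torch
from transformers import Trainer


class WarmupStepDecayTrainer(Trainer):
    def create_scheduler(self, num_training_steps: int, optimizer: torch.optim.Optimizer = None):
        optimizer = optimizer if optimizer is not None else self.optimizer
        warmup_steps = self.args.warmup_steps  # linear warmup length from TrainingArguments
        decay_every = 1_000                    # hypothetical: steps between decays
        decay_rate = 0.5                       # hypothetical: multiplicative decay factor

        def lr_lambda(current_step: int) -> float:
            if current_step < warmup_steps:
                # Linear warmup from 0 up to the base learning rate
                return float(current_step) / float(max(1, warmup_steps))
            # Step decay after warmup: multiply by decay_rate every decay_every steps
            return decay_rate ** ((current_step - warmup_steps) // decay_every)

        self.lr_scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
        return self.lr_scheduler
```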

Alternatively, you can create the optimizer and scheduler yourself and pass them as a tuple via the `optimizers` argument of `Trainer`, as sketched below.
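
A minimal sketch of that second route; `model` and `train_dataset` stand in for your own model and dataset, and the warmup/decay numbers are illustrative:

```python
import torch
from transformers import Trainer, TrainingArguments

# model and train_dataset are placeholders for your own training setup
args = TrainingArguments(output_dir="out", max_steps=10_000)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Linear warmup for 500 steps, then halve the LR every 1_000 steps (illustrative values)
def lr_lambda(step: int) -> float:
    if step < 500:
        return step / 500
    return 0.5 ** ((step - 500) // 1_000)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    optimizers=(optimizer, scheduler),  # Trainer uses these instead of creating its own
)
trainer.train()
```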
