Huggingface LR Decay Schedulers Spend the first epoch w/ an LR of 0

I don’t know if this is intended, or if I’m doing something wrong, but both in practice and from reading the code it looks to me like the LR schedulers in Transformers spend all of the first epoch with an LR of zero.

E.g., the polynomial decay scheduler uses PyTorch’s LambdaLR, which sets the LR to the initial LR multiplied by a decay factor. That factor is computed by passing an integer epoch parameter (which starts at zero) to the lr_lambda function specified here. This means that for all of epoch 0 the returned decay factor is 0, so the LR is set to zero as well.
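To make the arithmetic concrete, here is a minimal pure-Python sketch of the warmup/decay factor logic; it mirrors what I understand `get_polynomial_decay_schedule_with_warmup` to compute (the function name and default numbers below are illustrative, not copied from the library):

```python
# Sketch of the lr_lambda used by a polynomial-decay-with-warmup scheduler.
# LambdaLR multiplies the initial LR by the returned factor, so a factor of
# 0 means the optimizer trains with an LR of 0.
def lr_lambda(current_step, num_warmup_steps=500, num_training_steps=10_000,
              lr_init=5e-5, lr_end=1e-7, power=1.0):
    if current_step < num_warmup_steps:
        # Linear warmup: step 0 returns 0 / num_warmup_steps == 0.0,
        # which is exactly the "first call gives LR 0" behavior described above.
        return float(current_step) / float(max(1, num_warmup_steps))
    if current_step > num_training_steps:
        return lr_end / lr_init
    # Polynomial decay from lr_init down to lr_end over the remaining steps.
    lr_range = lr_init - lr_end
    remaining = 1 - (current_step - num_warmup_steps) / (num_training_steps - num_warmup_steps)
    return (lr_range * remaining ** power + lr_end) / lr_init

print(lr_lambda(0))  # 0.0 -> LR is lr_init * 0 = 0
print(lr_lambda(1))  # small but nonzero warmup factor
```

If this counter is only advanced once per epoch, it stays at 0 for every batch of the first epoch, which is the symptom above.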

I also see this in practice when using this scheduler together with a PyTorch Lightning model.
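The practical effect can be simulated without torch at all. Assuming (as Lightning does by default) that the scheduler counter is advanced once per epoch rather than once per batch, every batch of epoch 0 sees a factor of 0 (the warmup helper and step counts here are illustrative):

```python
# Simulate a training loop where the scheduler is stepped once per EPOCH.
def warmup_factor(step, num_warmup_steps=100):
    # Linear warmup factor, mirroring the scheduler's behavior at small steps.
    return min(1.0, step / num_warmup_steps)

lr_init = 5e-5
steps_per_epoch = 50
scheduler_counter = 0  # Lightning's default 'epoch' interval advances this once per epoch

lrs_epoch0 = []
for _ in range(steps_per_epoch):
    # The counter never moves during the epoch, so every batch uses factor 0.
    lrs_epoch0.append(lr_init * warmup_factor(scheduler_counter))
scheduler_counter += 1  # scheduler.step() fires only at epoch end

assert all(lr == 0.0 for lr in lrs_epoch0)  # the whole first epoch trained at LR 0
```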

Is this intended behavior for some reason? Or am I using this wrong or is this a bug in the HF library or something? Any insight would be greatly appreciated.

The LR scheduler in Lightning should be configured with interval='step' rather than the default 'epoch':
https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html?highlight=configure_optimizers#configure-optimizers
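Concretely, that means returning the scheduler from configure_optimizers in dict form with the interval set. This is a sketch assuming a LightningModule using transformers' get_polynomial_decay_schedule_with_warmup; the warmup/step counts and LR are placeholders:

```python
def configure_optimizers(self):
    optimizer = torch.optim.AdamW(self.parameters(), lr=5e-5)
    scheduler = get_polynomial_decay_schedule_with_warmup(
        optimizer,
        num_warmup_steps=500,       # illustrative values
        num_training_steps=10_000,
    )
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,
            "interval": "step",     # step the scheduler every batch, not every epoch
        },
    }
```

With interval='step', the scheduler's counter advances every optimizer step, so the LR leaves zero after the very first batch instead of after the first epoch.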