Why such a learning rate value?

I resumed training from a checkpoint. I set the learning rate in TrainingArguments to 5e-5, but the learning rate reported at the first logging step is 2.38e-05, and it keeps decreasing in the subsequent steps. How can I set the learning rate to the value I want? I do not understand where this 2.38e-05 comes from.

These are my training arguments.

training_args = Seq2SeqTrainingArguments(
    output_dir=output_dir,
    num_train_epochs=8,
    max_steps=-1,
    evaluation_strategy='epoch',
    eval_steps=0,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=5e-5,     # base learning rate; the scheduler scales it during training
    warmup_ratio=0.1,       # fraction of total steps used for linear warmup
    warmup_steps=0,
    logging_dir=None,
    logging_strategy='steps',
    logging_steps=50,       # metrics (including learning_rate) are logged every 50 steps
    disable_tqdm=disable_tqdm,
    save_strategy='epoch',
    save_steps=0,
    load_best_model_at_end=True,
    metric_for_best_model='eval_loss',
    seed=random_state,
    predict_with_generate=True,
    dataloader_num_workers=4,
    save_total_limit=10,
)

The scheduler used by default is linear decay with warmup: the learning_rate you set is the peak value, reached at the end of warmup, and it then decreases linearly towards 0 over the remaining training steps. What gets logged is the scheduler's current value, and since you're logging every 50 steps you never see the very first values. On top of that, because you resumed from a checkpoint, the optimizer and scheduler states are restored, so the learning rate continues from where the previous run left off instead of starting again at 5e-5.
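
To make the decay concrete, here is a minimal sketch of the schedule the Trainer builds by default (linear warmup followed by linear decay to zero), using get_linear_schedule_with_warmup on a dummy optimizer. The single dummy parameter, the total of 10,000 steps, and the steps being printed are assumptions for illustration; only learning_rate=5e-5 and warmup_ratio=0.1 come from the arguments above.

import torch
from transformers import get_linear_schedule_with_warmup

# Dummy parameter/optimizer just to inspect the schedule; not part of the
# original training setup.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=5e-5)

num_training_steps = 10_000                       # assumed total number of steps
num_warmup_steps = int(0.1 * num_training_steps)  # mirrors warmup_ratio=0.1

scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)

for step in range(num_training_steps):
    if step in (0, 500, num_warmup_steps, 5_000):
        # The scheduler scales the base lr: 0 at step 0, 5e-5 at the end of
        # warmup, then a linearly shrinking fraction of 5e-5 afterwards.
        print(step, scheduler.get_last_lr()[0])
    optimizer.step()    # no-op update, only needed so the scheduler can advance cleanly
    scheduler.step()

The printout shows the learning rate equals 5e-5 only at the end of warmup; everywhere else it is a scaled-down value, and that is what ends up in the logs. If you want the learning rate to stay fixed at the configured value instead, setting lr_scheduler_type='constant' (or 'constant_with_warmup') in the training arguments should do that.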

Here are the first few logs, one every 50 steps. The learning rate barely changes between them, so I assume the very first training step was not at 5e-5 either.

{'loss': 0.724, 'learning_rate': 2.3809441323160455e-05, 'epoch': 8.0}
{'loss': 0.5776, 'learning_rate': 2.3809028891343685e-05, 'epoch': 8.0}
{'loss': 0.6006, 'learning_rate': 2.3808616459526912e-05, 'epoch': 8.0}
{'loss': 0.6058, 'learning_rate': 2.3808204027710142e-05, 'epoch': 8.0}
{'loss': 0.5938, 'learning_rate': 2.3807791595893365e-05, 'epoch': 8.0}
{'loss': 0.6377, 'learning_rate': 2.3807379164076595e-05, 'epoch': 8.0}
{'loss': 0.5863, 'learning_rate': 2.3806966732259825e-05, 'epoch': 8.0}
{'loss': 0.5971, 'learning_rate': 2.380655430044305e-05, 'epoch': 8.0}
{'loss': 0.6842, 'learning_rate': 2.380614186862628e-05, 'epoch': 8.0}
{'loss': 0.6386, 'learning_rate': 2.3805729436809508e-05, 'epoch': 8.0}
{'loss': 0.6297, 'learning_rate': 2.3805317004992734e-05, 'epoch': 8.0}
{'loss': 0.6817, 'learning_rate': 2.380490457317596e-05, 'epoch': 8.0}
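
As a sanity check on those numbers, the logged values drop by the same amount between every pair of logging steps, which is exactly what a linear schedule looks like at this resolution. A quick sketch (the list is just the first few values copied from the logs above):

# Learning rates copied from the logs above, one entry per 50 optimizer steps.
lrs = [
    2.3809441323160455e-05,
    2.3809028891343685e-05,
    2.3808616459526912e-05,
    2.3808204027710142e-05,
]

# A linear schedule decreases by a constant amount per step, so the
# differences between consecutive logged values should all be equal.
deltas = [a - b for a, b in zip(lrs, lrs[1:])]
print(deltas)          # each difference is ~4.12e-10
print(deltas[0] / 50)  # implied per-step decrement of the scheduler

So the schedule is indeed decaying linearly, just very slowly at this point in training.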

I am also having the same issue. There is a lot of confusion regarding the metrics.