Training loss changes as we change the learning rate

Hi,
I am fine-tuning Microsoft Phi-2 on my own dataset, which consists of about 2k examples. I used the training arguments shown below:


from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=5,
    max_steps=-1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_strategy="steps",
    save_steps=1500,
    eval_steps=100,  # this line raises an error for me
    evaluation_strategy="steps",
    logging_steps=100,
    logging_strategy="steps",
    learning_rate=1e-5,
    report_to="tensorboard",
    fp16=False,
    bf16=True,
)
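
For completeness, this is roughly how I pass these arguments to the Trainer. It is a minimal sketch: train_dataset and eval_dataset are placeholders for my own tokenized splits, and as far as I understand, evaluation_strategy="steps" raises a ValueError if no eval_dataset is supplied, which may be the error noted above.

from transformers import AutoModelForCausalLM, Trainer

# Minimal sketch of how the arguments above are used; train_dataset and
# eval_dataset are placeholders for my own tokenized splits.
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

trainer = Trainer(
    model=model,
    args=training_arguments,
    train_dataset=train_dataset,  # ~2k examples
    eval_dataset=eval_dataset,    # needed when evaluation_strategy="steps"
)
trainer.train()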

Then I changed the number of epochs and the learning rate. I expected that decreasing the learning rate from 2e-4 to 1e-5 would make the training loss go lower, but that was not the case. Here is the TensorBoard graph: the blue curve is the run with learning rate 2e-4 and the pink one is 1e-5.


So shouldn't decreasing the learning rate also decrease the training loss?
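
For reference, a quick back-of-the-envelope count of the optimizer steps a run takes under the settings shown above, assuming a single GPU and that all 2k examples are in the training split:

# Rough optimizer-step count for the settings above (single-GPU assumption).
num_examples = 2000
per_device_train_batch_size = 2
gradient_accumulation_steps = 1
num_train_epochs = 5

steps_per_epoch = num_examples // (per_device_train_batch_size * gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs
print(steps_per_epoch, total_steps)  # 1000 steps per epoch, 5000 in total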