Why different num_train_epochs give different results?

I am running the sample code from the [Text classification](https://Text classification) tutorial.

When I set `num_train_epochs` to 5, the training loss is:

| Step | Training Loss |
|------|---------------|
| 500  | 0.320200 |
| 1000 | 0.246800 |
| 1500 | 0.230600 |
| 2000 | 0.171200 |
| 2500 | 0.160800 |
| 3000 | 0.152400 |
| 3500 | 0.102400 |
| 4000 | 0.085700 |
| 4500 | 0.098600 |
| 5000 | 0.066400 |
| 5500 | 0.050800 |
| 6000 | 0.045400 |
| 6500 | 0.033500 |
| 7000 | 0.030500 |
| 7500 | 0.030600 |

But when I set `num_train_epochs` to 1, the loss is:

| Step | Training Loss |
|------|---------------|
| 500  | 0.338900 |
| 1000 | 0.242900 |
| 1500 | 0.212500 |

If I run it again with `num_train_epochs` set to 1, I get exactly the same loss:

| Step | Training Loss |
|------|---------------|
| 500  | 0.338900 |
| 1000 | 0.242900 |
| 1500 | 0.212500 |

So my question is: why do different values of `num_train_epochs` give different losses, and how can I make them consistent? E.g., how can I make the run with `num_train_epochs` = 1 produce the same three losses as the first 1500 steps of the run with `num_train_epochs` = 5?

What learning rate schedule are you using? Some LR schedules (e.g. cosine) decay the learning rate based on the total number of training steps, so the schedule decays faster if you train for fewer epochs. At the same step number the optimizer is therefore using a different learning rate, which is why the losses diverge.
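As a concrete illustration (a minimal sketch, not the tutorial's exact code): a step-count-dependent schedule such as the linear decay that `Trainer` uses by default computes its decay from `num_training_steps`, so the learning rate at step 1500 is very different in a 1-epoch run versus a 5-epoch run. The total step counts below (1563 vs. 7815) are only illustrative stand-ins for roughly 1 and 5 epochs; swap in your own numbers.

```python
# Sketch: how the same optimizer step sees different learning rates
# depending on the total number of training steps the schedule was built for.
# Assumes torch and transformers are installed; numbers are illustrative.
import torch
from transformers import get_scheduler

def lr_at_step(total_steps, query_step, base_lr=5e-5, warmup_steps=0):
    # Dummy parameter/optimizer just to drive the scheduler.
    param = torch.nn.Parameter(torch.zeros(1))
    optimizer = torch.optim.AdamW([param], lr=base_lr)
    scheduler = get_scheduler(
        "linear",
        optimizer,
        num_warmup_steps=warmup_steps,
        num_training_steps=total_steps,
    )
    # Advance the schedule to the step we want to inspect.
    for _ in range(query_step):
        optimizer.step()
        scheduler.step()
    return scheduler.get_last_lr()[0]

# Same step 1500, different planned training lengths:
print(lr_at_step(total_steps=1563, query_step=1500))  # ~1 epoch: LR almost fully decayed (~2e-6)
print(lr_at_step(total_steps=7815, query_step=1500))  # ~5 epochs: LR still high (~4e-5)
```

If the goal is to get identical losses over the first epoch regardless of `num_train_epochs`, one option (assuming you are using `Trainer`) is to pick a schedule that does not depend on the total step count, e.g. `lr_scheduler_type="constant"` in `TrainingArguments`.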