I am running the sample code from the Text classification tutorial.
When I set `num_train_epochs` to 5, the training loss is:
| Step | Training Loss |
|---|---|
| 500 | 0.320200 |
| 1000 | 0.246800 |
| 1500 | 0.230600 |
| 2000 | 0.171200 |
| 2500 | 0.160800 |
| 3000 | 0.152400 |
| 3500 | 0.102400 |
| 4000 | 0.085700 |
| 4500 | 0.098600 |
| 5000 | 0.066400 |
| 5500 | 0.050800 |
| 6000 | 0.045400 |
| 6500 | 0.033500 |
| 7000 | 0.030500 |
| 7500 | 0.030600 |
But when I set `num_train_epochs` to 1, the loss is:
| Step | Training Loss |
|---|---|
| 500 | 0.338900 |
| 1000 | 0.242900 |
| 1500 | 0.212500 |
If I run it again with `num_train_epochs` = 1, I get exactly the same loss:
| Step | Training Loss |
|---|---|
| 500 | 0.338900 |
| 1000 | 0.242900 |
| 1500 | 0.212500 |
So my question is: why do different values of `num_train_epochs` give different losses at the same steps? How can I make them consistent, i.e., how can I get the same first three losses with `num_train_epochs` = 1 as with `num_train_epochs` = 5?
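My guess (an assumption on my part, not something I have verified) is that the learning-rate schedule depends on the *total* number of training steps, which changes with `num_train_epochs`, so the same step gets a different learning rate. Here is a minimal sketch of a linear decay with no warmup, using a hypothetical base LR of 2e-5 and a made-up steps-per-epoch count, just to illustrate the effect:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Linearly decay base_lr to 0 over total_steps (no warmup).

    base_lr and the schedule shape are assumptions for illustration,
    not necessarily what the tutorial's Trainer actually uses.
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

steps_per_epoch = 1563  # hypothetical value, only for illustration

for step in (500, 1000, 1500):
    # The same step sees a different LR depending on total epochs,
    # so the loss curves would diverge even with identical seeds.
    lr_1_epoch = linear_lr(step, 1 * steps_per_epoch)
    lr_5_epochs = linear_lr(step, 5 * steps_per_epoch)
    print(f"step {step}: 1-epoch LR = {lr_1_epoch:.3e}, "
          f"5-epoch LR = {lr_5_epochs:.3e}")
```

If that guess is right, the early losses would only match if the schedule (and hence the total step count) were made identical between the two runs.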