I am running the [Text classification](https://Text classification) sample code.
When I set num_train_epochs to 5, the training loss is:
Step | Training Loss |
---|---|
500 | 0.320200 |
1000 | 0.246800 |
1500 | 0.230600 |
2000 | 0.171200 |
2500 | 0.160800 |
3000 | 0.152400 |
3500 | 0.102400 |
4000 | 0.085700 |
4500 | 0.098600 |
5000 | 0.066400 |
5500 | 0.050800 |
6000 | 0.045400 |
6500 | 0.033500 |
7000 | 0.030500 |
7500 | 0.030600 |
But when I set num_train_epochs to 1, the loss is:
Step | Training Loss |
---|---|
500 | 0.338900 |
1000 | 0.242900 |
1500 | 0.212500 |
If I run again with num_train_epochs = 1, I get exactly the same loss:
Step | Training Loss |
---|---|
500 | 0.338900 |
1000 | 0.242900 |
1500 | 0.212500 |
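(For context on the rerun result: identical losses across runs with the same configuration are expected, since the Trainer seeds the RNGs with a fixed seed, 42 by default, so initialization and data order repeat exactly. A toy illustration of that kind of seeded determinism in plain Python, not taken from the actual training code:)

```python
import random

def seeded_draws(seed=42, n=3):
    # A fixed seed produces the identical sequence on every run,
    # which is why reruns with the same config log the same losses.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Two independent "runs" with the same seed agree exactly.
print(seeded_draws() == seeded_draws())  # True
```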
So my question is: why do different values of num_train_epochs give different losses? How can I make them consistent? For example, how can I make num_train_epochs = 1 produce the same three losses as the first three of the num_train_epochs = 5 run?
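One plausible factor (an assumption about the cause, not a confirmed diagnosis): the Trainer's default learning-rate schedule is a linear decay over the *total* number of training steps, and num_train_epochs changes that total. So even with the same seed, step 500 of a 1-epoch run and step 500 of a 5-epoch run see different learning rates, and the losses diverge. A minimal sketch of that effect, with the base learning rate and step counts assumed from the tables above (1500 steps per epoch):

```python
def linear_lr(step, total_steps, base_lr=5e-5):
    # Linear decay from base_lr at step 0 down to 0 at total_steps
    # (warmup omitted for simplicity).
    return base_lr * max(0.0, (total_steps - step) / total_steps)

steps_per_epoch = 1500  # inferred from the tables above

# Learning rate at step 500 differs depending on the planned run length.
for epochs in (1, 5):
    total = epochs * steps_per_epoch
    print(f"epochs={epochs}: lr at step 500 = {linear_lr(500, total):.3e}")
```

Under this assumption, matching the first three logged losses across the two settings would require the per-step learning rates to match, e.g. by pinning the schedule to the same total step count in both runs.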