I have already trained it on Hindi and Bangla and it worked fine, but when I train on Gujarati and Telugu, the loss drops to zero within 5K steps.
What could cause such a sudden drop in the loss? Can anyone suggest what the cause might be, or how to debug an issue like this?
Any suggestions?
Usually, this means you are training on your validation data, so I’d triple-check that your training set and validation set don’t contain the same texts, and that there is no leak from one to the other.
Otherwise, it’s just the model memorizing your training set. If you really want to know how it would fare on new data, you need to use a held-out validation set and compute the loss on that.
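As a quick sanity check for the leakage scenario, something like the sketch below can flag validation texts that also appear verbatim in the training data (the function name and the toy lists are just illustrative, not from any particular library):

```python
# Hypothetical leakage check: find validation texts that also occur
# verbatim in the training set. Exact-match only -- near-duplicates
# (different whitespace, casing, etc.) would need fuzzier matching.
def find_leaked(train_texts, val_texts):
    """Return the validation texts that also appear in the training data."""
    train_set = {t.strip() for t in train_texts}
    return [t for t in val_texts if t.strip() in train_set]

# Toy example with made-up sentences:
train = ["sentence one", "sentence two", "sentence three"]
val = ["sentence two", "a genuinely new sentence"]

leaked = find_leaked(train, val)
print(f"{len(leaked)} leaked example(s): {leaked}")
```

If this reports any overlap for the Gujarati or Telugu splits but not for Hindi and Bangla, that would explain why only those runs collapse to zero loss.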
@sgugger I’m using a validation set too, but the train loss becomes zero within 4 epochs. I tried to continue training for another 4 epochs with a lower learning rate, but the train loss starts at zero, so I get no improvement.