Why can't my simple BERT model for text classification learn anything?

Hi, I am facing a similar issue, and the solution above of placing optimizer.zero_grad() didn't resolve it. Any help is appreciated. I have spent a couple of weeks looking into this but still can't figure out why the Hugging Face model is not learning with my PyTorch training code, whereas it works when run through trainer.train().