I’m pretraining Longformer on my own dataset. The loss quickly dropped to around 8, but it hasn’t decreased any further, even after about 12 hours of training. I’ve tried several LR schedulers and lowered the learning rate, but the result seems to be the same… Can someone tell me what I can do to get the loss to decrease?
This is my LR scheduler: