Fine-tuning wav2vec2: loss explodes and then goes to zero after a certain number of steps

I have been trying to fine-tune Facebook's wav2vec2 following the tutorials by @patrickvonplaten.
I have tried several different models: large, xlsr, and a bunch of others.
I have also tried many combinations of training parameters, and the same problem occurs every time.
I am training on a GeForce RTX 3080 GPU. I have tried several driver versions, and the problem still occurs.
I also tried training on separate chunks of the dataset to see whether some of the audio files were at fault, but the problem does not seem to come from the dataset.

The behavior that keeps repeating:
Training goes smoothly at first, then the loss suddenly jumps to an extremely high value and afterwards drops to zero. I let training continue after that point, but it didn't help; nothing changed.
The attached image shows an example from one of the runs.
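
In case it helps to pin down where a run breaks, here is a minimal sketch of how the spike step could be located from a list of logged loss values. The function name `find_loss_spike`, the `spike_factor` and `window` defaults, and the sample numbers are all made up for illustration; they are not taken from my runs or from any library.

```python
import math


def find_loss_spike(losses, spike_factor=10.0, window=5):
    """Return the index of the first suspicious loss value, or None.

    A value is flagged if it is NaN/inf, or if it exceeds `spike_factor`
    times the mean of the preceding `window` values.
    (Hypothetical helper; thresholds are arbitrary assumptions.)
    """
    for i, loss in enumerate(losses):
        # NaN or inf is always treated as a failure point.
        if not math.isfinite(loss):
            return i
        # Only compare against a baseline once enough history exists.
        if i >= window:
            baseline = sum(losses[i - window:i]) / window
            if baseline > 0 and loss > spike_factor * baseline:
                return i
    return None


# Made-up values that mimic the pattern described above:
# smooth decrease, a sudden jump, then a collapse to zero.
example_losses = [3.1, 2.9, 2.8, 2.7, 2.6, 2.5, 250.0, 40.0, 0.0, 0.0]
spike_step = find_loss_spike(example_losses)
print("first suspicious step index:", spike_step)
```

Knowing the step index would make it possible to look at the batch that was fed at that point, assuming the data order is reproducible.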

I’d really appreciate it if anyone can help.