Predicting only " " after training (S2T) Wav2Vec2CTC

My work so far:

I copied the code from "Fine-Tune XLSR-Wav2Vec2 for low-resource ASR with 🤗 Transformers" and tried to implement each step separately.
However, after I run trainer.train() and try to get predictions, my model only predicts empty strings ("").
So what does trainer.train() do to my model? It should update the weights to improve performance, but it seems to be destroying the model instead…
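For context, an empty prediction does not have to mean the model was deleted. With CTC, greedy decoding collapses repeated tokens and then drops the blank token, so a model that outputs the blank at every frame decodes to "". The sketch below is a simplified pure-Python illustration, not the library's decoder; the token names "[PAD]" (used as the CTC blank) and "|" (word delimiter) are assumptions taken from the vocabulary in that blog post, and the example frame sequences are made up.

```python
from itertools import groupby

# Assumption: "[PAD]" plays the role of the CTC blank token and "|" is the
# word delimiter, as in the blog post's vocabulary. Adjust to your own vocab.
BLANK = "[PAD]"
WORD_DELIMITER = "|"


def ctc_greedy_decode(frame_tokens, blank=BLANK, word_delimiter=WORD_DELIMITER):
    """Collapse repeated tokens, drop blanks, turn the delimiter into a space."""
    collapsed = [token for token, _ in groupby(frame_tokens)]
    chars = [token for token in collapsed if token != blank]
    return "".join(" " if c == word_delimiter else c for c in chars)


# Hypothetical per-frame argmax outputs, for illustration only.
healthy = ["h", "h", "[PAD]", "i", "|", "|", "y", "o", "o"]
all_blank = ["[PAD]"] * 9

print(repr(ctc_greedy_decode(healthy)))
print(repr(ctc_greedy_decode(all_blank)))
```

If the per-frame argmax of the logits is always the blank id, the weights are still there, but training has most likely collapsed or diverged.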

Can someone help me please?

I have the same issue: the loss is NaN, and after 1 epoch the model predicts empty strings.
Have you found the root cause of the issue? Thanks.
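A NaN loss usually propagates into the weights through the gradients, so it can help to find the step where it first shows up. Below is a minimal sketch that scans a list of log entries for the first non-finite training loss. It assumes the entries are dicts with "loss" and "step" keys, which is how trainer.state.log_history is shaped; the helper name and the example logs are hypothetical.

```python
import math


def first_non_finite_loss(log_history):
    """Return the first log entry whose 'loss' is NaN or infinite, else None."""
    for entry in log_history:
        loss = entry.get("loss")  # evaluation entries have no 'loss' key
        if loss is not None and not math.isfinite(loss):
            return entry
    return None


# Hypothetical entries shaped like trainer.state.log_history.
example_logs = [
    {"loss": 12.4, "step": 100},
    {"eval_loss": 11.9, "step": 100},
    {"loss": float("nan"), "step": 200},
    {"loss": float("nan"), "step": 300},
]

bad_entry = first_non_finite_loss(example_logs)
print(bad_entry["step"] if bad_entry else "all training losses are finite")
```

With a real run this could be called as first_non_finite_loss(trainer.state.log_history) after training. If the loss is already NaN at the first logged step, the data processing and vocabulary are a more likely culprit than the training loop itself.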

+1, getting the same issue when trying to implement data processing, training, and evaluation separately.