I trained my own XLS-R ASR model by fine-tuning facebook/wav2vec2-xls-r-300m on about 150 hours of transcribed Turkish audio. It currently reaches about 18% WER.
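
For reference, WER here means the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference. A minimal sketch of that calculation, assuming plain whitespace tokenization and no text normalization; the function name `wer` is only illustrative and is not taken from any library or from the blog post:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentence pair, only to show the call; one word is dropped.
    print(wer("bugün hava çok güzel", "bugün hava güzel"))
```

Scoring the same test set with both decoders, using identical text normalization, keeps the comparison between the two setups fair.
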
I wanted to boost my model's performance by adding an LM, following the steps described in the blog post
"Boosting Wav2Vec2 with n-grams in 🤗 Transformers".
Although I followed that blog post exactly, I do not get any improvement.
In fact, the model with the LM performs worse than the model without it.
It seems that the LM either is not being applied or is not contributing anything positive to the output.
Could I be missing something when adding the LM to the ASR model?
I would appreciate any suggestions or guidance on this issue.
Thanks in advance.