I’m fine-tuning XLSR-Wav2Vec2 on Dhivehi (dv) using Common Voice, which has 18 hours of validated data for this language.
I followed the Turkish notebook and got a WER of 0.54 (trained for 60 epochs). Changing the learning rate to 5e-4 did not improve WER, but a learning rate of 1e-4 improved it slightly, to 0.52. These runs were trained for 30 epochs.
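For anyone comparing numbers, here is a minimal sketch of how a corpus-level WER like the ones above can be computed: total word-level edits (substitutions, insertions, deletions) divided by total reference words. The function name `word_error_rate` and the example sentences are my own illustration, and this is not necessarily the exact implementation behind the figures reported here.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        # Word-level Levenshtein distance, keeping only one row at a time.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp_words, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                ))
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


# Hypothetical example: one substitution and one deletion over 6 reference words.
refs = ["the cat sat on the mat"]
hyps = ["the cat sit on mat"]
print(round(word_error_rate(refs, hyps), 3))
```

Under this definition, a WER of 0.52 means roughly one word-level edit for every two reference words.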
Now I’m looking into how I can improve this.
I would love to collaborate or discuss, and any help on improving this would be appreciated.