Arabic ASR: Fine-Tuning Wav2Vec2


I’m planning to fine-tune XLSR-Wav2Vec2 for Arabic using Common Voice. I have already made code changes to the transformers examples that should help with Arabic (e.g., Buckwalter transliteration): transformers/examples/research_projects/wav2vec2 at master · huggingface/transformers · GitHub

You may also start with my pre-trained model: elgeish/wav2vec2-large-xlsr-53-arabic · Hugging Face – it should be easy to tweak the sample script to transliterate into Buckwalter (to match its vocab). I’m happy to help and collaborate. Good luck and happy fine-tuning!

Interested! I’d like to fine-tune on the EveryAyah and Tarteel Quranic audio datasets.

It should be interesting to see results on noisy and recited (Mujawwad) Arabic.

I’m still learning how to use Hugging Face datasets, but once I wrap my head around it, I can publish an EveryAyah dataset.