Training a wav2vec model from scratch


I’m new to the field of automatic speech recognition. I have a research project where we are trying to build a speech-to-text system for Romanian medics. I saw that there are many pre-trained models for different languages, which people seem to fine-tune.

I wanted to know if it’s possible to train wav2vec from scratch for a specific language. If the answer is yes, could somebody give me an example for one language?

You can pretrain it using this script: transformers/run_wav2vec2_pretraining_no_trainer.py at master · huggingface/transformers · GitHub

Note, however, that pretraining from scratch can be quite unstable, as mentioned in the README.
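To make the "from scratch" part concrete: the key difference from fine-tuning is that you instantiate the model from a config (random weights) rather than from a pretrained checkpoint. A minimal sketch using the actual `transformers` classes; the default config values are illustrative, not a recipe tuned for Romanian, and the full pretraining loop (data loading, masking, contrastive loss schedule) is what the linked script handles:

```python
from transformers import Wav2Vec2Config, Wav2Vec2ForPreTraining

# Build a randomly initialized wav2vec 2.0 model: no pretrained weights,
# so training it amounts to pretraining from scratch on your own audio.
config = Wav2Vec2Config()  # defaults roughly match the "base" architecture
model = Wav2Vec2ForPreTraining(config)

# Sanity check: the model exists and has trainable parameters.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```

Compare this with fine-tuning, where you would call `Wav2Vec2ForPreTraining.from_pretrained("some-checkpoint")` instead; everything downstream (the linked script, the data pipeline) is the same, which is why the script accepts either path.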