I would like to train my own ASR system for a very noisy environment.
If anyone has experience with this, it would be great to hear from you here.
I’d like to share some feedback of my own on this subject.
Together with Dmytro Chaplynsky, I added noise to the Common Voice 10 dataset, and I successfully trained a model on the noised data.
The published model (on Hugging Face): Yehor/wav2vec2-xls-r-300m-uk-with-small-lm-noisy
The noised data (on GitHub): egorsmkv/speech-recognition-uk, Speech Recognition for Ukrainian
The model is trained for Ukrainian.
I have posted the metrics on its Hugging Face page.
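
For anyone wondering what "noising" a clean dataset can look like in practice, here is a minimal sketch of one common approach: mixing a noise recording into a speech clip at a chosen signal-to-noise ratio. This is only an illustration and not necessarily how the data above was prepared. The function name `mix_at_snr` and its parameters are my own assumptions, and both inputs are assumed to be mono waveforms at the same sample rate.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return `speech` with `noise` added at roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; both inputs are 1-D waveforms
    sampled at the same rate.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        # Silent noise clip: nothing to add.
        return speech.copy()

    # Scale the noise so that speech_power / scaled_noise_power
    # equals 10 ** (snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Calling `mix_at_snr(clip, noise, snr_db)` with a different `snr_db` for each clip, for example drawn from a range of values, gives the model a spread of noise levels during training. Depending on the levels involved, the mixed signal may need to be rescaled before saving to avoid clipping.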