Wav2Vec2 For Indian English

Vishaal · March 24, 2021, 6:44pm

I’m trying to build an Automatic Speech Recognition model for Indian English (accents, dialect, etc.). I have around 15 hours of labeled data.

I followed the steps in the blog post by @patrickvonplaten, replacing the TIMIT dataset with my own and keeping everything else the same. After training, the WER is exactly 1.0.

The trained model outputs blank for every file in the test set and I don’t know where it is going wrong.
Any help would be much appreciated. Is anyone else attempting this?
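
For readers hitting the same symptom, here is a minimal sketch of why a CTC model can return an empty string for every file. It assumes greedy CTC decoding (collapse repeated ids, then drop the blank token) and uses a hypothetical toy vocabulary in which index 0 stands in for the blank/pad token and `|` for the word delimiter; the names `VOCAB`, `BLANK_ID`, `ctc_greedy_decode`, and `blank_fraction` are illustrative and are not part of any library.

```python
from itertools import groupby

# Hypothetical toy vocabulary (an assumption for illustration only).
# Index 0 plays the role of the CTC blank/pad token, "|" separates words.
VOCAB = ["<pad>", "|", "a", "c", "t"]
BLANK_ID = 0


def ctc_greedy_decode(frame_ids, vocab=VOCAB, blank_id=BLANK_ID):
    """Collapse repeated ids, drop blanks, and turn '|' into a space."""
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    chars = [vocab[token_id] for token_id in merged if token_id != blank_id]
    return "".join(chars).replace("|", " ").strip()


def blank_fraction(frame_ids, blank_id=BLANK_ID):
    """Share of frames whose most likely token is the blank."""
    if not frame_ids:
        return 0.0
    return sum(token_id == blank_id for token_id in frame_ids) / len(frame_ids)


# Per-frame argmax ids from a model that has learned something.
healthy_frames = [0, 3, 3, 0, 2, 0, 4, 4, 0]
# Per-frame argmax ids from a model that predicts blank everywhere.
all_blank_frames = [0] * 9

print(repr(ctc_greedy_decode(healthy_frames)), blank_fraction(healthy_frames))
print(repr(ctc_greedy_decode(all_blank_frames)), blank_fraction(all_blank_frames))
```

If the blank fraction on real test files is at or near 1.0, the empty transcripts come from the model itself rather than from the decoding or evaluation code.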

infinitejoy · March 27, 2021, 2:10pm

A WER of 1.0 does not tell you much by itself. If you are not getting anything in the output, the model may not have learnt anything yet, or some error may be failing silently. Try increasing the number of epochs or other tuning approaches and see if this resolves the issue.
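
As a rough illustration of the point about the metric, the sketch below computes word error rate from a word-level edit distance in plain Python. The helper names `edit_distance` and `wer` and the sample sentences are made up for this example; it is not the evaluation code from the blog post. It shows that empty predictions give a WER of exactly 1.0, because every reference word counts as one deletion.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        previous = current
    return previous[-1]


def wer(references, predictions):
    """Total word errors divided by total reference words."""
    errors = 0
    total_words = 0
    for reference, prediction in zip(references, predictions):
        ref_words = reference.split()
        errors += edit_distance(ref_words, prediction.split())
        total_words += len(ref_words)
    return errors / total_words


# Made-up reference transcripts for illustration.
references = ["please switch on the light", "what is the time"]
empty_predictions = ["", ""]

print(wer(references, empty_predictions))  # every word is a deletion
print(wer(references, references))         # identical text, no errors
```

Printing a few decoded predictions next to their references is usually more informative than the single number, since a WER of 1.0 can also come from non-empty but entirely wrong output.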

Vishaal · April 21, 2021, 6:13pm

Thank you for the suggestion. I increased the number of epochs and it fixed the issue.

infinitejoy · April 22, 2021, 9:31am

How well is the model working now? It would be great if you could open source it.

sunilkunchoor · November 17, 2021, 9:55am

@Vishaal Any update on the model you are building? It would be great if you can share the solution for the WER 1.0 error.
