ASR spell correction

pierreguillou · October 11, 2021, 4:17pm

Hi.

When searching for solutions for ASR error correction, I found this topic in the HF forum.

I would like to discuss two models with you.

FastCorrect

Recently, Microsoft Asia published the FastCorrect paper (and, more recently, FastCorrect 2). I like the two main ideas on which this model is based:

Training an edit-distance (edit alignment) model to adapt the number of source tokens (the ASR output, with errors) to the number of target tokens (the sentence without errors): the decoder input then has the right number of tokens (the target length), so the decoder can focus on finding the right token, where a change is needed, for each decoder input position.
Using a non-autoregressive (NAR) decoder to predict all the target tokens in parallel: according to the paper, this speeds up prediction by a factor of up to 9 compared with an autoregressive decoder. This is the proposed way to make such an ASR error correction model usable in real time.
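
To make the first idea concrete, here is a minimal sketch of how an edit alignment can turn a (source, target) pair into a per-source-token count: 0 means the token is deleted, 1 means it is kept or substituted, 2 or more means tokens are inserted next to it. This is my own simplified illustration, not the paper's code: the function names are hypothetical, and when several alignments have the same edit distance this sketch simply takes the first one it finds, whereas the paper describes a more careful choice among them.

```python
from typing import List


def edit_alignment_counts(source: List[str], target: List[str]) -> List[int]:
    """For each source token, return how many target tokens it aligns to."""
    n, m = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # keep or substitute
                dist[i - 1][j] + 1,         # delete a source token
                dist[i][j - 1] + 1,         # insert a target token
            )

    # Walk back from the end of both sentences along one optimal path.
    counts = [0] * n
    i, j = n, m
    while i > 0 or j > 0:
        cost = 1
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            cost = 0
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + cost:
            counts[i - 1] += 1  # kept or substituted: one target token
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            i -= 1              # deleted: this source token gets no target token
        else:
            if n > 0:
                # inserted: attach the extra target token to the source token on its left
                counts[max(i - 1, 0)] += 1
            j -= 1
    return counts


def adjust_source(source: List[str], counts: List[int]) -> List[str]:
    """Repeat or drop source tokens so the result has the target length."""
    adjusted: List[str] = []
    for token, count in zip(source, counts):
        adjusted.extend([token] * count)
    return adjusted


asr_output = "the cat sat on on the mat".split()
reference = "the cat sat on the mat".split()
token_counts = edit_alignment_counts(asr_output, reference)
decoder_input = adjust_source(asr_output, token_counts)
print(token_counts, decoder_input)
```

At training time, counts like these can serve as labels for the length predictor. At inference time the reference is not available, so the predicted counts are used to build the decoder input, which the NAR decoder then corrects in a single parallel pass.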

Interesting, no? What do you think of FastCorrect?

However, I did not find any released code.

Paper: FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition (last revised 1 Oct 2021)

T5 (or ByT5)

flexudy published Sentence Doctor on the HF model hub (github): it is a T5 model that attempts to correct the errors or mistakes found in sentences (the model works on English, German and French text). The training script is provided (train_any_t5_task.py): it should look like the HF translation scripts / HF translation notebook, but flexudy explains that it is based on Abhishek Kumar Mishra's transformer tutorial on text summarization (see also the HF summarization notebook).

Interesting, no? What do you think of using T5 (or ByT5) for ASR error correction?

Note: as the T5 decoder is autoregressive, I guess Sentence Doctor could not be used for ASR error correction in real time. Any thoughts about this real-time issue?
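
For anyone who wants to try this, here is a minimal sketch of running such a T5 correction model with the transformers library. Two things here are assumptions on my side and should be checked against the model card: the model ID `flexudy/t5-base-multi-sentence-doctor`, and the `repair_sentence: ... context: {left}{right}` prompt format. The helper names `build_prompt` and `correct_sentence` are hypothetical.

```python
# Assumption: model ID as I remember it from the hub; verify before use.
MODEL_NAME = "flexudy/t5-base-multi-sentence-doctor"


def build_prompt(sentence: str, left_context: str = "", right_context: str = "") -> str:
    """Build the text-to-text prompt (assumed format, see the model card)."""
    return f"repair_sentence: {sentence} context: {{{left_context}}}{{{right_context}}}"


def correct_sentence(
    sentence: str,
    left_context: str = "",
    right_context: str = "",
    model_name: str = MODEL_NAME,
    max_length: int = 32,
) -> str:
    """Generate a corrected sentence with a seq2seq (T5) checkpoint."""
    # Imported here so that build_prompt can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # The tokenizer appends the end-of-sequence token itself.
    inputs = tokenizer(
        build_prompt(sentence, left_context, right_context),
        return_tensors="pt",
    )
    # Greedy decoding: tokens are generated one after another (autoregressive).
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=1)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


print(build_prompt("m a medical doct", "That is my job I a", "or I save lives"))
```

Calling `correct_sentence(...)` downloads the checkpoint on first use. For ASR output, the left and right context could simply be the neighbouring transcript segments, but that is my guess, not something the model was necessarily trained for.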
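
To illustrate why the decoder type matters for latency, here is a toy sketch that only counts decoder passes. There is no real model in it: the `FIXES` lookup is a hypothetical stand-in for a trained network, and all names are mine. The point is just that an autoregressive decoder needs one sequential pass per generated token, while a NAR decoder produces all positions in one pass.

```python
from typing import Dict, List

# Hypothetical stand-in for a trained correction model.
FIXES: Dict[str, str] = {"sad": "sat"}


class CallCounter:
    """Counts how many decoder passes were needed."""

    def __init__(self) -> None:
        self.calls = 0


def decode_autoregressive(source: List[str], counter: CallCounter) -> List[str]:
    output: List[str] = []
    while len(output) < len(source):
        counter.calls += 1      # one decoder pass per generated token
        position = len(output)  # the next step depends on the prefix generated so far
        output.append(FIXES.get(source[position], source[position]))
    return output


def decode_non_autoregressive(source: List[str], counter: CallCounter) -> List[str]:
    counter.calls += 1          # a single pass predicts every position at once
    return [FIXES.get(token, token) for token in source]


tokens = "the cat sad on the mat".split()
ar_counter, nar_counter = CallCounter(), CallCounter()
ar_output = decode_autoregressive(tokens, ar_counter)
nar_output = decode_non_autoregressive(tokens, nar_counter)
print(ar_counter.calls, nar_counter.calls)
```

The real wall-clock difference depends on the model size, caching of past states and the hardware, so the pass count is only a rough indication.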

