Live Transcription/ASR

Ekram · September 18, 2022, 8:56pm

Hello all,

I hope everything is going well with you guys.

I need an offline live ASR engine for my project. I used the following GitHub repo, which implements live ASR using a wav2vec2 model: GitHub - oliverguhr/wav2vec2-live: A live speech recognition using Facebooks wav2vec 2.0 model.

The problem is that I did not get the accuracy I expected from this wav2vec2 model. I have two questions:

1. How can I improve the accuracy of the ASR engine when using the wav2vec2 model?
2. I found two other models on Hugging Face: speech2text and speech2text2. I tried to modify the repository above to use these models for live transcription, but could not get it to work. Has anyone used these models for live transcription? If so, please share your advice.
