Online/streaming speech recognition

arkadyark · March 17, 2021, 12:22am

Are there plans to implement online decoding for speech recognition models such as wav2vec2 and XLSR? More specifically, I would like to be able to feed in audio in short chunks and get partial transcripts back as they become available.
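To make the request concrete, here is a rough sketch of the kind of interface I have in mind. Everything in it is hypothetical: `stream_transcripts`, `window_samples`, and `fake_recognize` are names made up for illustration, not anything in the library, and `recognize` stands in for whatever would wrap a real model. It naively re-decodes the whole buffer each time, which a proper streaming decoder would presumably avoid.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical type: any function that maps raw audio samples to text,
# e.g. a wrapper around a model forward pass plus decoding.
Recognizer = Callable[[List[float]], str]


def stream_transcripts(
    chunks: Iterable[List[float]],
    recognize: Recognizer,
    window_samples: int = 16000,  # assumption: about 1 s of audio at 16 kHz
) -> Iterator[str]:
    """Yield a partial transcript whenever enough new audio has arrived."""
    buffer: List[float] = []
    pending = 0  # samples received since the last partial transcript
    for chunk in chunks:
        buffer.extend(chunk)
        pending += len(chunk)
        if pending >= window_samples:
            # Naive approach: re-decode all audio received so far.
            yield recognize(buffer)
            pending = 0
    if pending:
        # Flush whatever audio is left once the stream ends.
        yield recognize(buffer)


def fake_recognize(samples: List[float]) -> str:
    # Stand-in for a real model: only reports how much audio it was given.
    return f"<{len(samples)} samples>"


# Ten chunks of 4000 samples each, as a live input stream might deliver them.
partials = list(stream_transcripts([[0.0] * 4000 for _ in range(10)], fake_recognize))
```

With a real recognizer plugged in, each yielded string would be the transcript of everything heard so far.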

Motivation

The current wav2vec2 model in the library covers many use cases that involve batch recognition of pre-recorded audio. However, for an online application that needs to continuously recognize speech from a live input stream, this may not be sufficient.

randy912 · September 11, 2021, 6:50pm

I would very much like to know whether this is possible too! Have you gotten any further on this, @arkadyark?

Salama1429 · October 26, 2022, 8:19am

please check this one

Related topics

Topic	Category	Replies	Views	Activity
Wav2vec2 and whisper ASR live streaming	Models	1	783	May 15, 2023
Use wav2vec2 models with a microphone easily	Beginners	2	3683	May 28, 2021
Live Transcription/ASR	Beginners	0	1674	September 18, 2022
Decding Large Audio Files Using Wav2Vec2ForCTC Model	Models	2	742	October 28, 2021
Wav2vec2 for long audiofiles	Beginners	2	4156	March 18, 2022
