Wav2vec2 decoding with pyctcdecode no whitespaces

hdvos · October 13, 2021, 9:29am

I hope it is okay to ask this question here, even though it is not about the Hugging Face package itself. I am also not sure whether this is the right forum, so please let me know if I am in the wrong place.

I am trying to decode the wav2vec2 logits with pyctcdecode (see the kensho-technologies/pyctcdecode repository on GitHub, a fast and lightweight Python-based CTC beam search decoder for speech recognition) instead of the greedy decoder of the Hugging Face wav2vec2 implementation. The output looks great (often better than the greedy wav2vec2 decoding); however, it sometimes drops many of the spaces, producing output like

eurymembersaregoingtofocusintheirquestionsiamassuremdmthatafteryuelectionthecooperation
where the reference text is
juri members are going to focus in their questions i am sure madam that after your election the cooperation

Any suggestions on the root cause of this problem, and possibly how to fix it?

My setup:

wav2vec2 model and processor: facebook/wav2vec2-base-10k-voxpopuli-ft-e
arpa model: voxpopuli_en_5gram_lm, downloaded from the link in README.md (main branch) of the facebookresearch/voxpopuli repository on GitHub

EDIT: Since I posted this question, someone pointed me to an issue that might contain the answer, so I will start investigating it now: "Unexpected spacing with Huggingface wav2vec library", Issue #25 in kensho-technologies/pyctcdecode on GitHub.
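
For anyone landing here with the same symptom, one possible cause worth checking (I am assuming this is what is going on, I have not confirmed it for this exact model) is a label mismatch: the Hugging Face wav2vec2 tokenizers use "|" as the word delimiter and "<pad>" as the CTC blank, while pyctcdecode expects a literal space " " for word boundaries and an empty string "" for the blank. The sketch below shows one way to remap the vocabulary before building the decoder. The helper names `hf_vocab_to_ctc_labels` and `build_decoder` are made up for illustration, and the default token strings are assumptions that should be checked against the actual tokenizer.

```python
def hf_vocab_to_ctc_labels(vocab, word_delimiter="|", blank_token="<pad>"):
    """Return labels ordered by token id, in the form pyctcdecode expects.

    `vocab` is a dict mapping token -> id, e.g. from
    processor.tokenizer.get_vocab(). The default delimiter and blank
    strings are assumptions; verify them against your tokenizer.
    """
    # Order tokens by id so each label index matches its logit column.
    ordered = sorted(vocab.items(), key=lambda item: item[1])
    labels = []
    for token, _ in ordered:
        if token == word_delimiter:
            labels.append(" ")   # word boundary becomes a literal space
        elif token == blank_token:
            labels.append("")    # CTC blank becomes the empty string
        else:
            labels.append(token)  # all other tokens are kept unchanged
    return labels


def build_decoder(vocab, kenlm_model_path):
    """Hypothetical helper: build a pyctcdecode decoder from an HF vocab."""
    # Imported lazily so the remapping above can be tried without pyctcdecode.
    from pyctcdecode import build_ctcdecoder

    labels = hf_vocab_to_ctc_labels(vocab)
    return build_ctcdecoder(labels, kenlm_model_path=kenlm_model_path)


# Toy vocabulary for illustration only (not the real model vocabulary).
toy_vocab = {"<pad>": 0, "|": 1, "a": 2, "b": 3}
print(hf_vocab_to_ctc_labels(toy_vocab))  # ['', ' ', 'a', 'b']
```

With the real model, the vocabulary would come from `processor.tokenizer.get_vocab()`, and the returned decoder's `decode` method would be called on the logits of a single utterance as a 2D NumPy array (time steps by vocabulary size). If the labels passed to the decoder already contain " " and no "|", this remapping changes nothing and the cause is probably elsewhere.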
