Wav2Vec2 fine-tuning and language model

Hi there,
I’m fine-tuning wav2vec2-base-960h for disordered speech recognition. The model was trained on my custom dataset of isolated words (no sentences) uttered by speakers with atypical voices. On isolated-word recognition tasks its performance is very good; however, it fails to recognize a sequence of two or more keywords within a single speech recording. How can I recognize multiple keywords in one recording? Should I use a language model? Any suggestions?
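For context, here is a minimal sketch of the greedy CTC decoding step I understand the model to be doing after the acoustic forward pass (the frame-wise token IDs and the blank ID are made up for illustration). In principle this decoding should already emit a sequence of tokens when several words appear in one recording, which is why I’m unsure whether the problem is the decoder or the acoustic model itself:

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse repeated frame predictions, then drop CTC blanks.

    frame_ids: per-frame argmax token IDs from the CTC head.
    Returns the collapsed output token sequence.
    """
    out = []
    prev = None
    for t in frame_ids:
        # A token is emitted only when it differs from the previous
        # frame's prediction and is not the blank symbol.
        if t != prev and t != blank_id:
            out.append(t)
        prev = t  # track the raw previous frame, blanks included
    return out


# Two occurrences of token 3 separated by a blank stay distinct,
# so repeated keywords are not merged into one.
print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))  # → [3, 3, 5]
```

So with plain greedy decoding the model can in principle output several keywords per recording; my question is whether adding a language model (e.g. shallow fusion during beam-search decoding) is the right way to make that reliable.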
Thanks in advance,