Speech detection in recordings from SIP telephony with the G.729 codec

dkurdomanenko · May 12, 2024, 9:19pm

I need to fine-tune a multilingual speech recognition model to improve recognition quality on audio received via SIP telephony (G.729 codec). Is it a good idea to use xlsr-wav2vec2 for this task after decoding my recordings to .wav, or is there a model that was already trained on G.729-encoded recordings?
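
For context on the "decoding my recordings to .wav" step: G.729 is a narrowband codec with 8 kHz audio, while the XLSR wav2vec2 checkpoints were pretrained on 16 kHz audio, so the decode step would probably also need to resample. Below is a minimal sketch of how that could look, assuming an `ffmpeg` build with a G.729 decoder is on the PATH and can detect the input format. The helper names `build_decode_command` and `decode_recording` are hypothetical, not part of any library.

```python
import subprocess
from pathlib import Path

# wav2vec2 / XLSR checkpoints expect 16 kHz mono input; G.729 audio is 8 kHz.
TARGET_SAMPLE_RATE = 16_000


def build_decode_command(src, dst, sample_rate=TARGET_SAMPLE_RATE):
    """Build an ffmpeg command that decodes `src` to 16-bit PCM mono WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input recording (assumed G.729-encoded)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(dst),
    ]


def decode_recording(src, dst):
    """Run ffmpeg on one recording; raises CalledProcessError on failure."""
    dst = Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_decode_command(src, dst), check=True)
    return dst
```

Calling `decode_recording("call.g729", "wav/call.wav")` would write a 16 kHz mono file that can be fed to the model's feature extractor. Upsampling does not restore the frequency content lost by the codec, so fine-tuning on the decoded telephony audio itself, rather than on clean wideband speech, is likely what matters most here.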
