If this discussion is still ongoing: there is a pull request currently open, "Added Feature: Prefix decoding for wav2vec2 models" by deepang17 (Pull Request #11606, huggingface/transformers), and, as @ChristophBensch mentions, parlance/ctcdecode (PyTorch CTC decoder bindings) provides a means of using KenLM. We have an example of this at techiaith/docker-wav2vec2-xlsr-ft-cy (train wav2vec2 and KenLM models for Welsh language speech recognition and/or provide them via a simple API server), which has reduced our WER for Welsh from 25% to 15%. Since our scripts use HuggingFace's OSCAR dataset, they should be easy to adapt to train and optimize LMs for other lesser-resourced languages as well.
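For anyone comparing WER figures like the ones above, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `wer` and the sample sentences are made up for illustration. This is not the scoring code from any of the repositories mentioned, and it assumes plain whitespace tokenization with no text normalization.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumption: words are split on whitespace and compared exactly,
    with no lowercasing or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion over four words.
    print(wer("a b c d", "a x c"))
```

Going from 25% to 15% on this measure is a 40% relative reduction in word errors. Note that WER can exceed 100% when the hypothesis contains many inserted words.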