Language model for wav2vec2.0 decoding

Hello, I implemented wav2vec2.0, but no language model is used for decoding. How can I add a language model (say, one trained with KenLM) for decoding, @patrickvonplaten?

Thanks in advance.

Note: I also opened an issue, but was redirected here.

Hey Emre!

Yeah, good question. We currently don’t support evaluating with a language model, but we plan on adding this functionality soon! Sadly, it’s not that trivial to decode a CTC model with a language model. I’ll try to keep you posted on updates here!
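To illustrate why this is not trivial: a CTC model emits one distribution per audio frame, including a blank symbol and repeated characters, so a language model cannot simply be applied to the raw frame-wise argmax. Below is a minimal sketch of CTC prefix beam search in pure Python, where a language model score is added when ranking beams. This is not the method used by any particular library; the function names, the `lm_score` callback, and the `alpha` weight are assumptions made for illustration, and real decoders usually apply the LM at word boundaries rather than to the whole prefix at every frame.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_prefix_beam_search(log_probs, labels, blank_id=0, beam_width=8,
                           lm_score=None, alpha=0.5):
    """Decode per-frame log probabilities into a string.

    log_probs: list of frames, each a list of log probabilities per label id.
    lm_score:  hypothetical callback, text -> log score (assumption).
    alpha:     weight of the LM score when ranking beams (assumption).
    """
    def rank(item):
        prefix, (p_b, p_nb) = item
        score = log_add(p_b, p_nb)
        if lm_score is not None:
            score += alpha * lm_score("".join(labels[i] for i in prefix))
        return score

    # prefix -> (log prob of paths ending in blank, ... ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            p_total = log_add(p_b, p_nb)
            for c, lp in enumerate(frame):
                if c == blank_id:
                    # A blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (log_add(b, p_total + lp), nb)
                    continue
                new_prefix = prefix + (c,)
                if prefix and prefix[-1] == c:
                    # Repeat after a blank starts a new character ...
                    b, nb = nxt[new_prefix]
                    nxt[new_prefix] = (b, log_add(nb, p_b + lp))
                    # ... repeat without a blank collapses into the same prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, log_add(nb, p_nb + lp))
                else:
                    b, nb = nxt[new_prefix]
                    nxt[new_prefix] = (b, log_add(nb, p_total + lp))
        # Keep only the best prefixes, ranked by acoustic + weighted LM score.
        beams = dict(sorted(nxt.items(), key=rank, reverse=True)[:beam_width])

    best_prefix = max(beams.items(), key=rank)[0]
    return "".join(labels[i] for i in best_prefix)
```

With `lm_score=None` this reduces to a plain beam search over the acoustic scores. A KenLM model could in principle be plugged in through `lm_score`, though how to score partial words with a word-level model is exactly the part that needs care.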

Assuming that one already has a KenLM model, am I wrong to assume that it’s just a matter of passing the wav2vec2 output logits as an argument to the ctcdecode main function, exemplified here: GitHub - parlance/ctcdecode: PyTorch CTC Decoder bindings?

Or is there more to it than that?
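A few details likely need checking before handing the model output to an external decoder: raw logits are not probabilities, the decoder's label list has to follow the model's token ids, and the wav2vec2 tokenizer typically uses `|` as the word delimiter where a word-level LM expects a space. The sketch below shows that preparation in pure Python; the helper names and the toy vocabulary are hypothetical, and whether a given decoder wants probabilities or log probabilities should be confirmed in its own documentation.

```python
import math


def softmax(logits):
    """Turn one frame of raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def labels_from_vocab(vocab, word_delimiter="|"):
    """Order tokens by id and map the word delimiter to a space.

    vocab: dict of token -> id, as a tokenizer vocabulary would provide.
    """
    ordered = sorted(vocab.items(), key=lambda kv: kv[1])
    return [" " if tok == word_delimiter else tok for tok, _ in ordered]


# Toy vocabulary for illustration only (assumption, not a real checkpoint).
toy_vocab = {"<pad>": 0, "|": 1, "a": 2, "b": 3}
labels = labels_from_vocab(toy_vocab)
blank_id = toy_vocab["<pad>"]  # assumes the pad token doubles as the CTC blank
probs = [softmax(frame) for frame in [[2.0, 0.0, 1.0, -1.0]]]
```

The resulting `labels`, `blank_id`, and `probs` are the three things a beam search decoder generally needs to agree on with the acoustic model.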

@EmreOzkose Good question, I think. I’m not working on this at a professional level yet, but I’m currently looking into it. :slightly_smiling_face:

Hi all, I’ve been experimenting with KenLM and wav2vec2; here is the notebook.
I don’t know if this is a proper implementation, but it works!
I also still need to clean up a few things, such as the vocab.

@Wikidepia Can you share how much it improved your WER score? Also, did you try a character-level LM as well?