Hello, I have some questions about wav2vec2.
I have one fine-tuned model without an LM and one with an LM. Even the one with the LM keeps returning words that are outside the wordlist. I've read that the beam search decoder doesn't prevent the model from returning an invented word that doesn't exist in the wordlist, and that the LM mainly helps with re-punctuation and misspellings. I've also seen a way to force the output words to valid ones from a lexicon, but that doesn't work well either.
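To illustrate why invented words can appear at all, here is a minimal sketch (a toy example, not the actual wav2vec2/pyctcdecode pipeline) of greedy CTC decoding: the acoustic model predicts characters frame by frame, and the decoder just collapses repeats and removes blanks, so nothing stops it from assembling a character sequence that is not a real word.

```python
# Toy sketch: a character-level CTC head can emit any character sequence,
# including words outside any lexicon. Vocabulary and frame ids are invented.

BLANK = "_"
vocab = [BLANK, "a", "b", "c", "k", "t"]  # toy character vocabulary

def ctc_greedy_decode(frame_ids, vocab, blank=BLANK):
    """Standard CTC collapse rule: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for i in frame_ids:
        sym = vocab[i]
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)

# Hypothetical per-frame argmax ids from an acoustic model:
frames = [4, 4, 1, 0, 1, 5, 5]          # k k a _ a t t
print(ctc_greedy_decode(frames, vocab))  # -> "kaat", not a real word
```

An LM rescoring beam hypotheses can make such outputs less likely, but unless decoding is explicitly lexicon-constrained, it cannot make them impossible.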
My first question is about how this model handles OOV words in the decoding process. Does it need at least some occurrences of a word in training to learn the speech representations for it, so that the transcribed word makes sense and the LM can help? If the word wasn't in training, can the LM do nothing in these cases, since the model doesn't "know" the word?
My second doubt is, I think, related. When you fine-tune the model on domain-specific words, are you making the model "good" only in that context? Is that why a test set with out-of-training words gives worse results than a test set from the same domain?