I’m trying to use Facebook’s wav2vec2-base-960h model for speech recognition.
I tried this code on some FLAC files and it works, but I get a warning:
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Which weights is the warning referring to? How can I access them? And how do I train the model?
I’d be glad for an explanation.