I am a novice here, and I am using a different pretrained model than the default Wav2Vec2 checkpoint. I am currently playing with the create_wav2vec2.py script provided by PyTorch: android-demo-app/create_wav2vec2.py at master · pytorch/android-demo-app · GitHub
I load the pretrained model from Hugging Face, but during the sanity check the transcribed text is completely wrong.
The only line I changed was from:
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
to:
model1 = Wav2Vec2ForCTC.from_pretrained("patrickvonplaten/wav2vec2-base-timit-demo-colab")
Expected output:
Result: I HAD THAT CURIOSITY BESIDE ME AT THIS MOMENT
But I got:
Result: J <pad></s>DJ<pad>F</s>DJF<pad>JBJSN JKJCJ JFJO<pad>YLJCJ L<pad>HL<pad> F<pad>F</s> JC<pad>JHKJHLRFJ<pad>
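
For reference, here is roughly how I would sanity-check the new checkpoint on its own with transformers, loading the processor from the same repo so the tokenizer vocabulary matches the model head (a minimal sketch, not the demo script itself; "sample.wav" is a placeholder for a local test clip, not a file from the demo):

# Minimal standalone sanity check (a sketch, not the demo script itself);
# "sample.wav" is a placeholder for a local test clip.
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

checkpoint = "patrickvonplaten/wav2vec2-base-timit-demo-colab"
model1 = Wav2Vec2ForCTC.from_pretrained(checkpoint)
# Load the processor from the same checkpoint so the tokenizer's vocabulary
# matches the model's output layer (assumes the repo ships processor files).
processor = Wav2Vec2Processor.from_pretrained(checkpoint)

waveform, sample_rate = torchaudio.load("sample.wav")
if sample_rate != 16000:
    # Wav2Vec2 checkpoints expect 16 kHz input.
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model1(inputs.input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
print("Result:", processor.batch_decode(predicted_ids)[0])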
Could somebody advise what is wrong here?