I am trying to run inference with a pre-trained Wav2Vec2 model on a Raspberry Pi 4, using the example code at the end of this page: Wav2Vec2 — transformers 4.5.0.dev0 documentation. The predicted_ids come out as all 0s. To compare results, I ran the exact same code (with the execution target set to "cpu" in both cases) on Google Colab; there the predicted_ids are as expected and produce a reasonable transcription, unlike on the Raspberry Pi. Digging further, I noticed that the input values (i.e. the output of feature extraction) are identical on the Raspberry Pi and on Google Colab, but the logits returned by the model differ between the two, and that is what causes the predicted ids to be 0 on the RPi 4.
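In case it helps anyone reproducing this: a quick way to confirm the divergence is to dump the logits on each machine with np.save and diff them offline. A minimal sketch (compare_logits is just an illustrative helper I wrote, not part of transformers; the toy arrays stand in for the real (batch, time, vocab) logits):

```python
import numpy as np

# On each machine, after running the model:
#   np.save("logits.npy", logits.detach().numpy())
# then load both files on one machine and compare.
def compare_logits(a: np.ndarray, b: np.ndarray, atol: float = 1e-4):
    """Return (match_within_tolerance, max_absolute_difference)."""
    if a.shape != b.shape:
        return False, None
    max_diff = float(np.max(np.abs(a - b)))
    return bool(np.allclose(a, b, atol=atol)), max_diff

# Toy stand-ins for logits of shape (batch, time, vocab)
colab = np.zeros((1, 10, 32), dtype=np.float32)
pi = colab + 1e-6  # tiny numerical difference is fine
ok, diff = compare_logits(colab, pi)
```

In my case the difference was far larger than normal float noise, which is why the argmax flips to a constant id on the Pi.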
Do I need to build the model differently to run on the Raspberry Pi (an arm64 architecture)? Or do I need to pass some additional params during model inference?
@patrickvonplaten, thank you for writing the awesome blog Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers. I tried following the steps in the evaluation section and also tried using your pre-trained models instead of base wav2vec2. I keep getting all ids as 28 on the Raspberry Pi, whereas on Google Colab I get reasonable output. Any suggestions on how to make this work on the Raspberry Pi 4?
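For context on why "all 28" looks so suspicious to me: greedy CTC decoding is just a per-frame argmax over the vocab dimension, so a constant predicted id means one logit column dominates at every single timestep. A toy numpy sketch (a vocab size of 30 is my assumption based on the blog's English tokenizer):

```python
import numpy as np

vocab_size = 30  # assumed size of the blog's character vocab

# Healthy logits: values vary per frame, so the argmax varies too
healthy = np.random.default_rng(0).normal(size=(1, 5, vocab_size))

# Broken logits: one column dominates at every frame,
# e.g. if the forward pass collapsed numerically on the Pi
broken = np.full((1, 5, vocab_size), -1.0)
broken[..., 28] = 0.0

healthy_ids = healthy.argmax(axis=-1)  # varied ids per frame
broken_ids = broken.argmax(axis=-1)    # 28 at every frame
```

So the audio and feature extraction are fine; something in the model's forward pass itself must be going wrong on arm64.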