We tried to fine-tune a wav2vec2 model using the Google Colab shared by @patrickvonplaten, with our own dataset, on this Google Colab. We got significantly higher errors, so we tried to recreate the results on the TIMIT dataset, but we still got higher errors; here is the link to it. I am not able to figure out what the reason might be. @patrickvonplaten, could you look into this?