Whisper large-v3 finetuning

I’m trying to fine-tune a Whisper model with Hugging Face, following the blog post Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers, and adding LoRA, with approximately 50 h of annotated audio. While fine-tuning works fine for the medium and large-v2 versions, when I try large-v3 the model starts to hallucinate after 400 steps. By hallucination I mean an increasing WER with repeated words (I added prediction logs to see how the model reacts), even though the training and eval loss are still decreasing.

Hello! I have exactly the same problem, but the hallucinations and repetitions start after a different number of steps depending on the size of my datasets. Have you found a solution?

I have the same issue :frowning: . Fine-tuning large-v3 gives terrible results :frowning:

Is this perhaps a well-known problem?