I followed the Fine-tuning the ASR model course to fine-tune the Whisper-Small model on one of the Common Voice languages. Since the training data was too large to fit into Google Colab’s disk space, I pre-processed the data and split it into two parts. After training on both splits, the model outputs the same word repeatedly for any input. I am curious what could be causing this. Has anyone run into a similar issue before?
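To make the symptom easier to compare across checkpoints, here is a minimal sketch of one way to quantify "the same word repeatedly". It measures what share of a transcript is taken up by its most frequent word. The function name `dominant_word_ratio` and the sample strings are hypothetical and made up for illustration; they are not real model output, and this is not part of the course code.

```python
from collections import Counter


def dominant_word_ratio(transcript: str) -> float:
    """Return the share of words taken up by the single most frequent word.

    A value close to 1.0 means the transcript is almost entirely one
    repeated word; an empty transcript returns 0.0.
    """
    words = transcript.lower().split()
    if not words:
        return 0.0
    # most_common(1) returns a list with one (word, count) pair.
    _, top_count = Counter(words).most_common(1)[0]
    return top_count / len(words)


# Made-up strings for illustration only, not real model output.
collapsed = "the the the the the the"
ordinary = "she opened the window and looked outside"

print(dominant_word_ratio(collapsed))
print(dominant_word_ratio(ordinary))
```

Running this on the decoded predictions after each split, assuming they are available as plain strings, could show whether the repetition already appears after the first split or only after the second.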