RuntimeError: blank must be in label range

Hi, I’m trying to train a Tamil model. I ran the code as explained in Patrick’s video, but I ran into this error. Can you help me understand what causes it?

This is my Colab notebook

Here is a shareable link to the notebook:

The Colab seems to work fine for me. It trains when I run it.

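You should set vocab_size in Wav2Vec2ForCTC.from_pretrained() so that it matches the size of your tokenizer's vocabulary. The CTC loss uses the pad token ID as its blank label, and it raises this error when that ID falls outside the range of the model's output labels.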
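To make the condition concrete, here is a minimal sketch of the range check behind the error, followed by one way the sizes might be passed when loading the model. The vocabulary, the helper names blank_in_label_range and load_ctc_model, and the size 5 are made up for illustration; the checkpoint name is passed in by the caller, and the processor is assumed to be a Wav2Vec2Processor built from your own vocab file.

```python
def blank_in_label_range(blank_id, num_labels):
    """Mirror of the check the CTC loss performs on the blank label."""
    return 0 <= blank_id < num_labels


# Hypothetical character vocabulary, with the special tokens appended last.
vocab = {"அ": 0, "ம": 1, "்": 2, "|": 3, "[UNK]": 4, "[PAD]": 5}
pad_id = vocab["[PAD]"]

# An output layer with only 5 labels cannot represent blank ID 5.
print(blank_in_label_range(pad_id, 5))           # False
# Sizing the output layer to the full vocabulary fixes it.
print(blank_in_label_range(pad_id, len(vocab)))  # True


def load_ctc_model(checkpoint, processor):
    """Sketch only: load a CTC model whose head matches the tokenizer.

    Assumes `processor` is a Wav2Vec2Processor created from your vocab file.
    """
    from transformers import Wav2Vec2ForCTC  # imported lazily on purpose

    return Wav2Vec2ForCTC.from_pretrained(
        checkpoint,
        pad_token_id=processor.tokenizer.pad_token_id,
        vocab_size=len(processor.tokenizer),
    )
```

If the first check prints False for your own pad ID and vocab size, the mismatch described above is the likely cause.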

This also happens if the word delimiter token you have selected is already part of the language's vocabulary. In Hindi (and other languages written in Devanagari script) the pipe "|" is used in place of a full stop, so be careful to select a token that does not occur in the normal text of the language.
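A quick way to check for such a clash before building the vocabulary is sketched below. The helper names, the candidate delimiters, and the sample sentences are hypothetical and only illustrate the idea.

```python
def colliding_tokens(transcripts, special_tokens):
    """Return the special tokens that already occur as characters in the text."""
    corpus_chars = set("".join(transcripts))
    return [tok for tok in special_tokens if tok in corpus_chars]


def pick_delimiter(transcripts, candidates=("|", "_", "~", "#")):
    """Return the first candidate delimiter that never appears in the text."""
    corpus_chars = set("".join(transcripts))
    for candidate in candidates:
        if candidate not in corpus_chars:
            return candidate
    raise ValueError("every candidate delimiter occurs in the transcripts")


# Hypothetical transcripts in which "|" is typed as sentence-final punctuation.
sample = ["यह एक वाक्य है |", "नमस्ते"]

print(colliding_tokens(sample, ["|", "[UNK]", "[PAD]"]))  # ['|']
print(pick_delimiter(sample))                             # _
```

Alternatively, you could strip the punctuation mark from the transcripts during preprocessing and keep "|" as the delimiter; which option fits better depends on whether you want the model to predict that mark.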