I’m fine-tuning Whisper for a low-resource language (Chichewa) and following this tutorial. The one change I made is to save the model to a local directory instead of pushing it to the Hub. When it’s time to use the fine-tuned model with the pipeline module, I get this error:
Can't load tokenizer for '/content/drive/My Drive/Chichewa-ASR/models/whisper-small-chich/checkpoint-1000. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/content/drive/My Drive/Chichewa-ASR/models/whisper-small-chich/checkpoint-1000' is the correct path to a directory containing all relevant files for a WhisperTokenizer tokenizer.
When I check the model repository used by the tutorial author, I see that it has several files I don’t have in my model checkpoint directory, such as tokenizer_config.json. How do I ensure these files are saved in the local model checkpoint directories?
I had the same problem. Have you solved it?
@sanchit-gandhi can you help with this? I have a similar problem.
I had the same problem.
I copied the tokenizer_config.json file from the Whisper model output directory to the checkpoint directory.
Then I loaded the model from the checkpoint directory.
It worked for me.
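For anyone landing here, this is a rough sketch of the copy step described above, using only the Python standard library. The helper name `copy_tokenizer_files` and the file list are my own assumptions, not something from the tutorial. The list is a guess at what a Whisper tokenizer and processor usually save, so adjust it to whatever is actually in your output directory.

```python
import shutil
from pathlib import Path

# Assumed file names (not taken from the tutorial); edit to match your output directory.
TOKENIZER_FILES = [
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "normalizer.json",
    "added_tokens.json",
    "special_tokens_map.json",
    "preprocessor_config.json",
]


def copy_tokenizer_files(output_dir, checkpoint_dir, names=TOKENIZER_FILES):
    """Copy files that exist in output_dir but are missing from checkpoint_dir.

    Returns the list of file names that were copied.
    """
    src = Path(output_dir)
    dst = Path(checkpoint_dir)
    copied = []
    for name in names:
        source_file = src / name
        target_file = dst / name
        # Skip files the output directory does not have, and never overwrite.
        if source_file.is_file() and not target_file.exists():
            shutil.copy2(source_file, target_file)
            copied.append(name)
    return copied
```

You would call it with your own paths, e.g. `copy_tokenizer_files(output_dir, output_dir + "/checkpoint-1000")`, and then load the checkpoint as before. Two alternatives that might avoid the copy, though I have not tested them on this setup: call `processor.save_pretrained(...)` on the checkpoint directory after training, or pass `tokenizer=` and `feature_extractor=` explicitly to `pipeline(...)`.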