I am using Wav2Vec2CTCTokenizer.from_pretrained to load the tokenizer for Facebook's base LibriSpeech model:
tokenizer = Wav2Vec2CTCTokenizer.from_pretrained('facebook/wav2vec2-base-960h')
I am seeing some behavior I do not understand. If a vocab.json file already exists in the directory from which I run the command above, the tokenizer appears to ignore the vocab.json from the base model and use the one in my directory instead. Is this the expected behavior? If so, where does it happen in the source code? I cannot find it.
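
One way to confirm which vocabulary was actually loaded is to compare the tokenizer's token-to-id mapping against the local vocab.json. The following is only a minimal sketch under stated assumptions: the helper name vocab_differences is hypothetical (it is not part of transformers), and it assumes vocab.json is a flat JSON object mapping tokens to integer ids. An empty result would suggest the local file is the one in use. Special or added tokens may show up as differences even when the local file was loaded.

```python
import json
from pathlib import Path


def vocab_differences(loaded_vocab, vocab_path="vocab.json"):
    """Compare a loaded token-to-id mapping with the one stored in vocab_path.

    Hypothetical helper for illustration; not part of the transformers library.
    Returns {token: (loaded_id, local_id)} for every token whose id differs,
    using None when the token is missing on one side.
    """
    local_vocab = json.loads(Path(vocab_path).read_text(encoding="utf-8"))
    differences = {}
    for token in set(loaded_vocab) | set(local_vocab):
        loaded_id = loaded_vocab.get(token)
        local_id = local_vocab.get(token)
        if loaded_id != local_id:
            differences[token] = (loaded_id, local_id)
    return differences
```

With the tokenizer from the command above, this could be called as vocab_differences(tokenizer.get_vocab()), since get_vocab() returns the token-to-id dictionary the tokenizer is using.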