Hi,
I was trying to use `AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0")` to load the w2v-bert model, but I always get this error:
File "/home/jcsilva/huggingsound/.venv/lib/python3.11/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 1185, in __init__
    raise ValueError(
ValueError: You are trying to instantiate <class 'transformers.models.wav2vec2_bert.modeling_wav2vec2_bert.Wav2Vec2BertForCTC'> with a configuration that does not define the vocabulary size of the language model head. Please instantiate the model as follows: `Wav2Vec2BertForCTC.from_pretrained(..., vocab_size=vocab_size)`. or define `vocab_size` of your model's configuration.
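For reference, a minimal snippet that reproduces it (this is just the call from above):

```python
from transformers import AutoModelForCTC

# Fails: the checkpoint's config.json has "vocab_size": null,
# so the model cannot size the language-model (CTC) head.
model = AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0")
```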
Investigating the issue, I found two possible causes:
- The `vocab_size` param defined in the config file at config.json · facebook/w2v-bert-2.0 at main is equal to `null`. @reach-vb or @ylacombe, would it be possible to remove this param (`vocab_size`) from the model config file? If not, what do you think about setting any valid value (e.g. 32, such as what we see at config.json · facebook/wav2vec2-large-xlsr-53 at main)?
- The `vocab_size` default value for the W2VBert model is `None` (please see it here), but it is 32 for Wav2Vec2 models, as you can see here. Could we have both `vocab_size` defaults set to 32? That way, the `ValueError` exception I mentioned in this ticket would not be raised when using `AutoModelForCTC`.
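In the meantime, the workaround that the error message itself suggests does work: pass `vocab_size` explicitly to `from_pretrained`. A minimal sketch, where 32 is only a placeholder; in practice it should be the size of the CTC vocabulary/tokenizer you plan to fine-tune with:

```python
from transformers import AutoModelForCTC

# Workaround suggested by the error message: supply vocab_size explicitly.
# 32 is just an example value (matching the wav2vec2-large-xlsr-53 config);
# use the size of your own CTC vocabulary instead.
model = AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0", vocab_size=32)
```

This avoids the exception, but it would still be nicer if the stock config worked out of the box with `AutoModelForCTC`.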
Thank you