Vocab_size value for facebook/w2v-bert-2.0

Hi,

I'm trying to use AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0") to load the w2v-bert model, but I always get the following error:

File "/home/jcsilva/huggingsound/.venv/lib/python3.11/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 1185, in __init__
    raise ValueError(
ValueError: You are trying to instantiate <class 'transformers.models.wav2vec2_bert.modeling_wav2vec2_bert.Wav2Vec2BertForCTC'> with a configuration that does not define the vocabulary size of the language model head. Please instantiate the model as follows: Wav2Vec2BertForCTC.from_pretrained(..., vocab_size=vocab_size). or define vocab_size of your model's configuration.
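For completeness, reproducing only takes the one call (this is with a recent transformers version):

    from transformers import AutoModelForCTC

    # Raises the ValueError above, because vocab_size is null in the
    # model's config.json on the Hub:
    model = AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0")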

Investigating the issue, I found two possible causes:

  1. The vocab_size param defined in the config file at config.json · facebook/w2v-bert-2.0 at main is null. @reach-vb or @ylacombe, would it be possible to remove this param (vocab_size) from the model config file? If not, what do you think about setting it to a valid value (e.g., 32, as we see at config.json · facebook/wav2vec2-large-xlsr-53 at main)?

  2. The default vocab_size for the W2VBert model is None (please see it here), but it is 32 for Wav2Vec2 models, as you can see here. Could both defaults be set to 32? That way the ValueError exception I mentioned in this ticket would not be raised when using AutoModelForCTC.
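In the meantime, the workaround suggested by the error message itself does work: pass vocab_size explicitly at load time. A minimal sketch, assuming a CTC tokenizer built from your own vocab.json (the file name and tokenizer here are placeholders for whatever vocabulary you fine-tune with):

    from transformers import AutoModelForCTC, Wav2Vec2CTCTokenizer

    # Placeholder: a CTC tokenizer built from your own vocab.json.
    tokenizer = Wav2Vec2CTCTokenizer("vocab.json")

    # Passing vocab_size overrides the null value in the config and
    # sizes the LM head, so the ValueError is not raised. Deriving it
    # from the tokenizer avoids hard-coding a number.
    model = AutoModelForCTC.from_pretrained(
        "facebook/w2v-bert-2.0",
        vocab_size=len(tokenizer),
        pad_token_id=tokenizer.pad_token_id,
    )

Still, it would be nicer if AutoModelForCTC worked out of the box, hence the two suggestions above.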

Thank you
