NameError: name 'BertTokenizer' is not defined

Hi,

I am trying to add custom tokens using the code below:

# Let's see how to increase the vocabulary of Bert model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')

num_added_toks = tokenizer.add_tokens(['token_1'])
print('We have added', num_added_toks, 'tokens')
model.resize_token_embeddings(len(tokenizer))  # Note: resize_token_embeddings expects the full size of the new vocabulary, i.e. the length of the tokenizer.

However, when I run the code above, I get this error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-36-31798d520617> in <module>()
      1 # Let's see how to increase the vocabulary of Bert model and tokenizer
----> 2 tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
      3 model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')
      4 
      5 num_added_toks = tokenizer.add_tokens(['token_1'])

NameError: name 'BertTokenizer' is not defined

Hey @anon58275033, which version of transformers are you using? I was not able to reproduce the error in v4.6.1.


Hi @lewtun - I have actually fixed the error by adding this line of code:

from transformers import BertTokenizer, BertForMaskedLM
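
For anyone landing here with the same error: a NameError like this usually just means the class was never imported in the current session. Note also that the import above brings in BertForMaskedLM, while the original snippet calls AutoModelForMaskedLM, so that name needs to be imported as well (or BertForMaskedLM used in its place) unless an earlier cell already did so. Below is a minimal sketch of the whole add-tokens flow with every name imported up front. To keep it runnable offline it uses a hypothetical toy vocabulary and a tiny, randomly initialised model instead of downloading 'bert-base-uncased'; those sizes and the toy vocabulary are assumptions for illustration only, and it assumes transformers 4.x with PyTorch installed.

```python
import os
import tempfile

# Import every class that is used below; leaving one out is what
# triggers "NameError: name '...' is not defined".
from transformers import BertConfig, BertForMaskedLM, BertTokenizer

# Hypothetical toy vocabulary, used only so the sketch runs without a download.
toy_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]

with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "vocab.txt")
    with open(vocab_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(toy_vocab) + "\n")
    # The vocabulary is read into memory here, so the file can go away afterwards.
    tokenizer = BertTokenizer(vocab_path)

# Tiny randomly initialised model (assumed sizes, not those of bert-base-uncased).
config = BertConfig(
    vocab_size=len(tokenizer),
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertForMaskedLM(config)

# Same steps as in the question: add a token, then resize the embeddings
# to the full new vocabulary size.
num_added_toks = tokenizer.add_tokens(["token_1"])
print("We have added", num_added_toks, "tokens")

model.resize_token_embeddings(len(tokenizer))
print("Embedding rows:", model.get_input_embeddings().num_embeddings)
```

With the real checkpoint, the tokenizer and model lines would instead be the from_pretrained('bert-base-uncased') calls from the question, with the matching classes imported from transformers first.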