Load fine-tuned model from local

Hey,

if I fine-tune a BERT model, is the tokenizer somehow affected?

If I save my fine-tuned model like:

bert_model.save_pretrained('./Fine_tune_BERT/')

is the tokenizer saved too? Is it modified? Do I need to save it as well?

Because loading the tokenizer like:

tokenizer = BertTokenizer.from_pretrained('Fine_tune_BERT/')

gives an error:
> OSError: Model name 'Fine_tune_BERT/' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). We assumed 'Fine_tune_BERT/' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.

So I assume I can load the tokenizer in the normal way?

The model is independent of your tokenizer, so you need to also do:

tokenizer.save_pretrained('./Fine_tune_BERT/')

to be able to load it back with from_pretrained.
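For reference, a minimal sketch of the full save/load round trip, assuming bert_model and tokenizer are the objects from your fine-tuning script:

from transformers import BertTokenizer, TFBertModel

# Save the weights/config and the vocabulary files into the same
# directory, so from_pretrained can find everything in one place.
bert_model.save_pretrained('./Fine_tune_BERT/')
tokenizer.save_pretrained('./Fine_tune_BERT/')

# Later, both can be loaded back from that directory.
bert_model = TFBertModel.from_pretrained('./Fine_tune_BERT/')
tokenizer = BertTokenizer.from_pretrained('./Fine_tune_BERT/')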

But the important question is: do I need this? Can I still download the tokenizer the normal way? Is the tokenizer affected by model fine-tuning? I assume not, so I could still use the tokenizer from your API?

So:

tokenizer = BertTokenizer.from_pretrained('bert-base-cased')

but

bert_model = TFBertModel.from_pretrained('Fine_tune_BERT/')

Yes, if you didn't make any changes to the tokenizer, you can still use the pretrained version.
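For completeness, a sanity check of that mixed setup (hub tokenizer plus locally saved weights); a recent transformers version with a callable tokenizer is assumed, and the directory name is taken from the thread:

from transformers import BertTokenizer, TFBertModel

# Unchanged tokenizer comes from the hub; fine-tuned weights come from disk.
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
bert_model = TFBertModel.from_pretrained('Fine_tune_BERT/')

# Tokenize a sentence and run it through the reloaded model.
inputs = tokenizer('Hello world!', return_tensors='tf')
outputs = bert_model(inputs)
print(outputs[0].shape)  # (batch_size, sequence_length, hidden_size)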
