I think my problem here is due to a transformers version mismatch, but I would like some help with this.
Previously I used the huggingface library to perform language model fine-tuning. This takes a corpus and an existing BERT model, and fine-tunes that model on the corpus. My command was:
python run_language_modelling.py --output_dir=lm_finetune --model_type=bert --model_name_or_path=bert-base-uncased --do_train --train_data_file=thread0_wdc20.txt --do_eval --eval_data_file=wiki.test.raw --mlm --save_total_limit=1 --save_steps=2 --line_by_line --num_train_epochs=2
I fine-tuned the models successfully, and this created a folder that contained the following files:
checkpoint-183236 config.json eval_results_lm.txt lm_finetune pytorch_model.bin special_tokens_map.json tokenizer_config.json training_args.bin vocab.txt
And I also successfully loaded this fine-tuned language model for downstream tasks.
The problem is that I don’t remember the versions of the libraries I used to do all this (pytorch, transformers, tensorflow…).
Recently, I have been experimenting with something that required me to reinstall these libraries. Their versions are now:
tensorflow-gpu 2.2.0
transformers 3.0.2
pytorch 1.4.0
torchtext 0.5.0
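So that I don't lose track of versions again, something like the following could be saved next to each fine-tuned model. It is only a minimal sketch: `collect_versions` is a helper name I made up, and it assumes each library exposes a `__version__` attribute (note that it takes import names, so PyTorch is `torch`).

```python
import importlib
import json


def collect_versions(module_names):
    """Map each module name to its __version__ string (None if unavailable)."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed in this environment.
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to None.
        versions[name] = getattr(module, "__version__", None)
    return versions


if __name__ == "__main__":
    # Import names, not pip names: PyTorch is imported as "torch".
    found = collect_versions(["transformers", "torch", "tensorflow", "torchtext"])
    print(json.dumps(found, indent=2))
```

Redirecting the printed JSON to a file inside the output folder would keep the environment record together with the model files.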
And when I use this environment to reload those previously fine-tuned language models, I get this error:
File "/home/li1zz/.conda/envs/tensorflow-gpu/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/li1zz/.conda/envs/tensorflow-gpu/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/li1zz/wop_matching/src/exp/run_bert_standard.py", line 313, in <module>
bert_model = transformers.TFBertModel.from_pretrained(bert_model)
File "/home/li1zz/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/transformers/modeling_tf_utils.py", line 437, in from_pretrained
[WEIGHTS_NAME, TF2_WEIGHTS_NAME], pretrained_model_name_or_path
OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5'] found in directory /home/li1zz/bert/lm_finetune_proddesc/lm_finetune_part00 or `from_pt` set to False
The fine-tuning output folder listed above contains pytorch_model.bin but no tf_model.h5, so tf_model.h5 is evidently the file that is missing.
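To double-check which weight files a model folder actually holds, a small check like the one below could be run first. This is a sketch under assumptions: `describe_checkpoint` and `load_tf_bert` are hypothetical helper names, the two file names are copied from the error message, and the `from_pt=True` argument is my reading of the `from_pt` hint in that message (as far as I understand, it makes the TF class convert PyTorch weights, which needs both torch and tensorflow installed). I have not verified it against these exact versions.

```python
import os

# File names taken from the error message above.
PT_WEIGHTS = "pytorch_model.bin"
TF_WEIGHTS = "tf_model.h5"


def describe_checkpoint(model_dir):
    """Report which weight files exist in model_dir."""
    has_pt = os.path.isfile(os.path.join(model_dir, PT_WEIGHTS))
    has_tf = os.path.isfile(os.path.join(model_dir, TF_WEIGHTS))
    return {
        "has_pytorch_weights": has_pt,
        "has_tf_weights": has_tf,
        # Only PyTorch weights present: a TF model class would need from_pt=True.
        "needs_from_pt": has_pt and not has_tf,
    }


def load_tf_bert(model_dir):
    """Hypothetical helper: load TFBertModel, converting PyTorch weights if needed."""
    import transformers  # imported lazily so the check above works without it

    info = describe_checkpoint(model_dir)
    if not (info["has_pytorch_weights"] or info["has_tf_weights"]):
        raise FileNotFoundError("no weight file found in " + model_dir)
    return transformers.TFBertModel.from_pretrained(
        model_dir, from_pt=info["needs_from_pt"]
    )
```

If `describe_checkpoint` reports `needs_from_pt` as true for the folder in the traceback, then the folder itself is intact and the question becomes whether the loading call, rather than the saved model, is what changed.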
I don’t understand how I got this error, since the fine-tuned models definitely worked before. The only explanation I can think of is a version mismatch: I fine-tuned those models with library versions that are incompatible with the ones I am using now, which would explain why one file is missing.
Can anyone provide some insight into this? Am I using the wrong versions of the libraries? How can I fix this without redoing all the language model fine-tuning in the new environment?