Load weights from a local ckpt file

I have downloaded a standard BERT ckpt file,
[image]
but how can I load these weights into a model?

I have read the documentation:


and tried some methods like:

config = BertConfig.from_json_file('./bert_model/bert_config.json')
model = TFBertModel.from_pretrained('bert_model/bert_model.ckpt', config=config)
or
config = BertConfig.from_json_file('./bert_model/bert_config.json')
model = TFBertModel(config).load_weights('bert_model/bert_model.ckpt')

but neither seems to work.

Hi @Sniper, I’m not very familiar with the TensorFlow API of transformers but I think the following should work:

config = BertConfig.from_pretrained("path/to/your/bert/directory")
model = TFBertModel.from_pretrained("path/to/your/bert/directory", config=config)

If that doesn’t work, can you share the error that you get?

Thanks for your reply.
I still have some questions.
Should path/to/your/bert/directory be path/ or path/model.ckpt?
If I use path/, I get this error:
OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5'] found in directory bert_model/ or from_pt set to False
If I use path/model.ckpt, I get a different error:
OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True.

It seems the code can’t recognize my ckpt file.

The code is like this:
config = BertConfig.from_json_file('./bert_model/bert_config.json')
model = TFBertModel.from_pretrained('bert_model/bert_model.ckpt', config=config)

As far as I know, the path/to/your/bert/directory in the from_pretrained function should point to the root of the directory where the model / tokenizer files are stored, not to individual files.

Can you share the contents of your directory? It seems there might be some files missing. Also, are you trying to load a model that was trained in PyTorch or TensorFlow?
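To answer the "which files are there" question quickly, a small standard-library check like the one below might help. It is only a sketch: the weight file names are copied from the OSError quoted above, the ".ckpt.index" suffix follows the checkpoint naming used in this thread, and the helper name describe_model_dir and the accepted config file names are assumptions made for illustration.

```python
from pathlib import Path

# Weight file names copied from the OSError message quoted in this thread.
HF_WEIGHT_FILES = ("tf_model.h5", "pytorch_model.bin")

# Assumption: the config is stored under one of these two names.
CONFIG_FILES = ("config.json", "bert_config.json")


def describe_model_dir(directory):
    """Report which weight formats are present in a model directory.

    Hypothetical helper for illustration; it is not part of transformers.
    """
    names = {p.name for p in Path(directory).iterdir() if p.is_file()}
    return {
        # Files that the error message says from_pretrained looked for.
        "hf_weights": sorted(n for n in names if n in HF_WEIGHT_FILES),
        # TF checkpoint index files such as bert_model.ckpt.index.
        "tf_checkpoints": sorted(n for n in names if n.endswith(".ckpt.index")),
        # Config files found under the assumed names.
        "configs": sorted(n for n in names if n in CONFIG_FILES),
    }
```

Calling describe_model_dir("bert_model") on a directory that has only checkpoint files would give an empty "hf_weights" list, which matches the first OSError reported above.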

Hi @lewtun,
I showed my directory in the first post; here it is again. (I downloaded these files from GitHub - google-research/bert: TensorFlow code and pre-trained models for BERT)
[image]

It can be loaded by other frameworks like bert4keras.

In fact, I have a runnable model written with bert4keras, and I am trying to rewrite the code for TensorFlow 2 (bert4keras is based on TF1).

Thanks for the clarification - I see in the docs that one can indeed point from_pretrained at a TF checkpoint file:

A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index ). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

It seems that what you’re missing is the from_tf=True argument, so maybe something like the following works:

config = BertConfig.from_pretrained("path/to/your/bert/directory")
model = TFBertModel.from_pretrained("path/to/bert_model.ckpt.index", config=config, from_tf=True)

I’m not sure whether the config should be loaded with from_pretrained or from_json_file but maybe you can test both to see which one works :slight_smile:

I added from_tf.
But it raises an error (my version of transformers is the latest one):
TypeError: ('Keyword argument not understood:', 'from_tf')
It seems the from_tf parameter is only for PyTorch, and the TensorFlow classes only have the from_pt parameter.

The docs differ between PyTorch and TensorFlow; the TF version of the docs only has from_pt.

Ah now I understand the source of your OSError: it seems that from_pretrained expects your model to be serialised as tf_model.h5 instead of the TensorFlow checkpoint format. You can see this in the source code here: transformers/modeling_tf_utils.py at master · huggingface/transformers · GitHub

If you’re happy to use the PyTorch API, it seems that it might be possible to load .ckpt files with from_pretrained: transformers/modeling_utils.py at master · huggingface/transformers · GitHub

In that case you could try something like

config = BertConfig.from_pretrained("path/to/your/bert/directory")
model = BertModel.from_pretrained("path/to/bert_model.ckpt.index", config=config, from_tf=True)

Maybe I should load the .ckpt with PyTorch and save the model, then load it with TensorFlow?

I think I can use ALBERT instead :frowning:

Thanks a lot for your patient help. Happy Chinese New Year!

Yes that might work - good luck and happy Chinese New Year to you too :slight_smile: !
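The round trip discussed above (load the .ckpt in PyTorch, save it, then load the saved model from TensorFlow with from_pt=True) could be sketched roughly as follows. This is an untested sketch under several assumptions: transformers, torch and tensorflow are all installed, the installed transformers release still exports load_tf_weights_in_bert and the TF model classes, and the function names checkpoint_prefix and convert_tf1_bert_checkpoint are made up for illustration. It uses BertForPreTraining rather than BertModel on the PyTorch side so that the pre-training head variables in the checkpoint have somewhere to go; that is a design choice of this sketch, not something confirmed in the thread.

```python
import os


def checkpoint_prefix(path):
    """Return the TF checkpoint prefix, dropping a trailing '.index' if present."""
    suffix = ".index"
    return path[: -len(suffix)] if path.endswith(suffix) else path


def convert_tf1_bert_checkpoint(config_file, ckpt_path, output_dir):
    """Load a TF1 BERT checkpoint in PyTorch, save it, and reload it as TFBertModel.

    Hypothetical helper; exact behaviour may vary between transformers versions.
    """
    # Imported lazily because these need transformers, torch and tensorflow.
    from transformers import (
        BertConfig,
        BertForPreTraining,
        TFBertModel,
        load_tf_weights_in_bert,
    )

    config = BertConfig.from_json_file(config_file)
    pt_model = BertForPreTraining(config)

    # Copy the variables stored in the TF checkpoint into the PyTorch model.
    prefix = os.path.abspath(checkpoint_prefix(ckpt_path))
    load_tf_weights_in_bert(pt_model, config, prefix)

    # Write the config and the PyTorch weights into output_dir.
    os.makedirs(output_dir, exist_ok=True)
    pt_model.save_pretrained(output_dir)

    # from_pt=True is the flag named in the OSError messages in this thread.
    return TFBertModel.from_pretrained(output_dir, from_pt=True)
```

With the file layout from the first post, a call might look like convert_tf1_bert_checkpoint("bert_model/bert_config.json", "bert_model/bert_model.ckpt.index", "bert_model_converted"), where the output directory name is arbitrary.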