Is there a way to correctly load a pre-trained transformers model without the configuration file?

You are absolutely correct: the checkpoint also includes the state of other components, not just the model weights. I hadn't noticed this! I checked the keys with the code below:

import torch

MODEL_PATH = "./aerobert/phase2_ckpt_4302592.pt"
keys = torch.load(MODEL_PATH).keys()
keys

Output: dict_keys(['model', 'optimizer', 'master params', 'files'])

If I look at the 'files' entry, it contains quite a few file paths, as below:

[3,
 '/local_workspace_data/bert/part-00879-of-00500.hdf5',
 '/local_workspace_data/bert/part-00562-of-00500.hdf5',
 '/local_workspace_data/bert/part-01703-of-00500.hdf5',
 '/local_workspace_data/bert/part-01706-of-00500.hdf5',
 …]
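
For reference, the 'model' entry can also be inspected on its own, without building a model first. This is just a quick sketch on my side; the map_location="cpu" argument and the key slicing are my own additions:

import torch

MODEL_PATH = "./aerobert/phase2_ckpt_4302592.pt"
ckpt = torch.load(MODEL_PATH, map_location="cpu")

model_state = ckpt["model"]                  # the actual weights
print(len(model_state), "parameter tensors")
print(list(model_state.keys())[:5])          # first few parameter names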

If I run your code below, it produces an error:

import torch
from transformers import AutoConfig, BertModel

MODEL_PATH = "./checkpoint.pt"
state_dict = torch.load(MODEL_PATH)["model"]
config = AutoConfig.from_pretrained("./bert_config.json")
model = BertModel(config)

model = BertModel._load_state_dict_into_model(
    model,
    state_dict,
    MODEL_PATH
)[0]

The error:

RuntimeError: Error(s) in loading state_dict for BertModel:
size mismatch for bert.embeddings.word_embeddings.weight: copying a param with shape torch.Size([30528, 1024]) from checkpoint, the shape in current model is torch.Size([30522, 1024]).

Does this mean the vocabulary of the saved model has 6 additional tokens (30528 instead of the standard 30522)?
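
If so, would something like the following be a reasonable workaround? It is just a sketch on my side that reuses your loading code, with the config's vocab_size bumped to match the checkpoint's embedding shape; I have not verified that the extra rows are merely padding:

import torch
from transformers import AutoConfig, BertModel

MODEL_PATH = "./aerobert/phase2_ckpt_4302592.pt"

state_dict = torch.load(MODEL_PATH, map_location="cpu")["model"]

config = AutoConfig.from_pretrained("./bert_config.json")
config.vocab_size = 30528  # match the [30528, 1024] word-embedding weight in the checkpoint

model = BertModel(config)
model = BertModel._load_state_dict_into_model(
    model,
    state_dict,
    MODEL_PATH,
)[0]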