I think you should be able to do
model.resize_token_embeddings(30528)
before you load the state dict. The weights should then load successfully without a size mismatch on the embedding layer. However, as you point out, it is likely that they added tokens to the tokenizer, so you should get their tokenizer files as well. Then it would be as simple as:
MODEL_PATH = "./checkpoint.pt"
state_dict = torch.load(MODEL_PATH)["model"]
config = AutoConfig.from_pretrained("./bert_config.json")
tokenizer = <load tokenizer here>
model = BertModel(config)
model.resize_token_embeddings(len(tokenizer))
model = BertModel._load_state_dict_into_model(
model,
state_dict,
MODEL_PATH
)[0]
# make sure token embedding weights are still tied if needed
model.tie_weights()
# Set model in evaluation mode to deactivate DropOut modules by default
model.eval()
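
If you want to double-check that the resize and the load actually lined up, a quick sanity check would be something like the following (continuing from the snippet above; the input sentence is just a dummy):

# The embedding matrix should now have one row per token in the tokenizer's vocabulary
assert model.get_input_embeddings().weight.shape[0] == len(tokenizer)

# A dummy forward pass should run without shape errors
inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, hidden_size])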