Is there a way to correctly load a pre-trained transformers model without the configuration file?

This is telling you that the checkpoint you were given contains more than the model weights: the state of the optimizer (and possibly other training objects) was saved alongside the state of the model. So you only need to load the "model" key.

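If you want to confirm this, you can load the checkpoint and print its top-level keys first (a quick sanity check; the exact key names depend on how the checkpoint was saved, so "model" and "optimizer" here are just the likely candidates):

import torch

# peek at what the training checkpoint actually stores
ckpt = torch.load("./checkpoint.pt", map_location="cpu")
print(ckpt.keys())  # e.g. dict_keys(['model', 'optimizer', ...])

Maybe there is a better way than this, but I think you can do: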

import torch
from transformers import AutoConfig, BertModel

MODEL_PATH = "./checkpoint.pt"

# keep only the model weights from the full training checkpoint
state_dict = torch.load(MODEL_PATH, map_location="cpu")["model"]

# build the model from the saved configuration file
config = AutoConfig.from_pretrained("./bert_config.json")
model = BertModel(config)

# private transformers helper that maps the state dict onto the model
model = BertModel._load_state_dict_into_model(
    model,
    state_dict,
    MODEL_PATH
)[0]

# make sure the token embedding weights are still tied if needed
model.tie_weights()

# set the model in evaluation mode to deactivate dropout modules by default
model.eval()
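
If the key names in the checkpoint's state dict already match the model's own parameter names, a plain PyTorch alternative might work without the private helper (untested as well; load_state_dict raises on mismatched keys unless you pass strict=False):

import torch
from transformers import AutoConfig, BertModel

config = AutoConfig.from_pretrained("./bert_config.json")
model = BertModel(config)

# public PyTorch API; returns the keys that did not line up
state_dict = torch.load("./checkpoint.pt", map_location="cpu")["model"]
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)

model.tie_weights()
model.eval()

If both printed lists are empty, the checkpoint matched the architecture exactly; if not, you may need to rename or strip prefixes on the keys before loading.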

I did not test this.
