Is there a way to correctly load a pre-trained transformers model without the configuration file?

This is telling you that the checkpoint you were given contains more than the model weights: the state of the optimizer (and possibly other training objects) was saved alongside the state of the model. So you only need to load the "model" key.

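If you want to confirm this, you can load the checkpoint and print its top-level keys first (a quick sanity check; the exact key names depend on how the checkpoint was saved, so "model" and "optimizer" here are just the likely candidates):

import torch

# peek at what the training checkpoint actually stores
ckpt = torch.load("./checkpoint.pt", map_location="cpu")
print(ckpt.keys())  # e.g. dict_keys(['model', 'optimizer', ...])

Maybe there is a better way than this, but I think you can do: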

import torch
from transformers import AutoConfig, BertModel

MODEL_PATH = "./checkpoint.pt"

# keep only the model weights from the full training checkpoint
state_dict = torch.load(MODEL_PATH, map_location="cpu")["model"]

# build the model from the saved configuration file
config = AutoConfig.from_pretrained("./bert_config.json")
model = BertModel(config)

# private transformers helper that maps the state dict onto the model
model = BertModel._load_state_dict_into_model(
    model,
    state_dict,
    MODEL_PATH
)[0]

# make sure the token embedding weights are still tied if needed
model.tie_weights()

# set the model in evaluation mode to deactivate dropout modules by default
model.eval()
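
If the key names in the checkpoint's state dict already match the model's own parameter names, a plain PyTorch alternative might work without the private helper (untested as well; load_state_dict raises on mismatched keys unless you pass strict=False):

import torch
from transformers import AutoConfig, BertModel

config = AutoConfig.from_pretrained("./bert_config.json")
model = BertModel(config)

# public PyTorch API; returns the keys that did not line up
state_dict = torch.load("./checkpoint.pt", map_location="cpu")["model"]
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)

model.tie_weights()
model.eval()

If both printed lists are empty, the checkpoint matched the architecture exactly; if not, you may need to rename or strip prefixes on the keys before loading.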

I did not test this.
