I would like to load the weights of a DistilBertModel checkpoint ('./my_model') into a DPRQuestionEncoder architecture.
To do that I've tried:
model = DPRQuestionEncoder.from_pretrained('./my_model')
# This raises the following error:
NotImplementedError: Make sure `_init_weigths` is implemented for <class 'transformers.models.dpr.modeling_dpr.DPRQuestionEncoder'>
model = DPRPretrainedQuestionEncoder.from_pretrained('./my_model')
# This seems to work, although many layers are reported as not initialized
# from the checkpoint; I assume that is expected. However, when I try to save the model
model.save_pretrained('./my_model_DPR')
# it raises the following error:
StopIteration                             Traceback (most recent call last)
/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py in get_parameter_dtype(parameter)
129 try:
--> 130 return next(parameter.parameters()).dtype
131 except StopIteration:
StopIteration:
During handling of the above exception, another exception occurred:
StopIteration                             Traceback (most recent call last)
<ipython-input-5-0de030754651> in <module>
----> 1 model.save_pretrained('./my_model_DPR')
/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py in save_pretrained(self, save_directory, save_config, state_dict, save_function, push_to_hub, **kwargs)
977 # save the string version of dtype to the config, e.g. convert torch.float32 => "float32"
978 # we currently don't use this setting automatically, but may start to use with v5
--> 979 dtype = get_parameter_dtype(model_to_save)
980 model_to_save.config.torch_dtype = str(dtype).split(".")[1]
981
/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py in get_parameter_dtype(parameter)
137
138 gen = parameter._named_members(get_members_fn=find_tensor_attributes)
--> 139 first_tuple = next(gen)
140 return first_tuple[1].dtype
141
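For context, the kind of partial weight transfer I'm attempting can be sketched in plain PyTorch with `load_state_dict(strict=False)`, which copies matching keys and reports the rest instead of raising (the module classes below are hypothetical stand-ins, not the actual transformers architectures):

```python
import torch
from torch import nn

# Toy stand-ins for the two architectures: both have an "encoder" submodule
# with identical shapes, but the target has an extra head the source lacks.
class SourceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)

class TargetModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)
        self.extra_head = nn.Linear(4, 2)  # no weights for this in the source

src = SourceModel()
tgt = TargetModel()

# strict=False tolerates missing/unexpected keys, mirroring the
# "some layers were not initialized" warning from from_pretrained.
result = tgt.load_state_dict(src.state_dict(), strict=False)
print(result.missing_keys)  # keys in tgt that received no weights (extra_head.*)
assert torch.equal(tgt.encoder.weight, src.encoder.weight)
```

This shows why the partially-initialized load itself is not an error, only a warning about the uncopied keys.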