Load frozen layers from one checkpoint and new layers from a second checkpoint?

I have a custom model that adheres to the Hugging Face spec, which allows me to use save_pretrained and from_pretrained to correctly save and load models from local directories.

After pre-training the model I want to fine-tune it with extra layers: all the original layers stay frozen and new layers are introduced for the fine-tuning task. When I save this new model I want to save only the unfrozen layers, to save disk space, and when loading it I want to load the original pre-trained layers from the first checkpoint and the new layers from the second checkpoint. Is this at all possible using the functions provided by the transformers library? It is similar to loading adapters onto a PEFT model, except that I am not using the PEFT library and these are not adapters: they are entirely new components that exist in the fine-tuned model class but are absent from the pre-trained model class.

I want to be able to do something like this:

modelA = MyCausalModel()
train( modelA )
modelA.save_pretrained( '/modelA' )

...

modelB = MyCausalModelWithHead.from_pretrained( '/modelA' )
modelB.backbone.requires_grad_( False )
modelB.new_layers.requires_grad_( True )
train( modelB )
modelB.save_pretrained( '/modelB', only_new_layers=True )

...

restored_modelB = MyCausalModelWithHead.from_pretrained( '/modelA', '/modelB' )
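
For completeness, the closest I can get with the existing API, as far as I can tell, is to filter the state dict by hand and reload in two steps. This is only a rough sketch: the only_new_layers flag above is made up, backbone and new_layers are just the attribute names in my own model class, and I am assuming the default safetensors serialization so the second checkpoint ends up in /modelB/model.safetensors.

import torch
from safetensors.torch import load_file

# --- saving: keep only the parameters that were actually trained ---
trainable_keys = { name for name, p in modelB.named_parameters() if p.requires_grad }
partial_state_dict = { k: v for k, v in modelB.state_dict().items() if k in trainable_keys }
# note: any buffers belonging to the new layers would have to be added here by hand

# save_pretrained accepts an explicit state_dict, so only the new layers hit the disk
modelB.save_pretrained( '/modelB', state_dict=partial_state_dict )

# --- restoring: backbone from checkpoint A, new layers from checkpoint B ---
# from_pretrained warns about missing keys and leaves the new layers randomly initialised
restored_modelB = MyCausalModelWithHead.from_pretrained( '/modelA' )

# then overwrite just the new layers with the weights from the second checkpoint
new_layer_weights = load_file( '/modelB/model.safetensors' )
missing, unexpected = restored_modelB.load_state_dict( new_layer_weights, strict=False )

If manual state-dict filtering like this is the intended approach, that is fine too; I just want to make sure I am not missing a built-in mechanism.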