I have a custom model that adheres to the Hugging Face spec, so I can use save_pretrained
and from_pretrained
to correctly save and load models from local directories.
After pre-training the model I want to fine-tune it with extra layers: all the original layers stay frozen and new layers are introduced for the fine-tuning task. When I save this new model I want to save only the unfrozen layers to reduce disk usage, and then when loading it I want to load the original pre-trained layers from the first checkpoint and the new layers from the second checkpoint. Is this at all possible using the functions provided by the transformers
library? This is similar to loading adapters from a PEFT model, except that I am not using the PEFT library, and these are not really adapters: they are entirely new components that are present in the fine-tuned model class but absent from the pre-trained model class.
I want to be able to do something like this:
modelA = MyCausalModel()
train( modelA )
modelA.save_pretrained( '/modelA' )
...
modelB = MyCausalModelWithHead.from_pretrained( '/modelA' )
modelB.backbone.requires_grad_( False )
modelB.new_layers.requires_grad_( True )
train( modelB )
modelB.save_pretrained( '/modelB', only_new_layers=True )
...
restored_modelB = MyCausalModelWithHead.from_pretrained( '/modelA', '/modelB' )
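For context, here is the kind of workaround I imagine if there is no built-in support. It is a minimal sketch in plain PyTorch rather than transformers, and Backbone, ModelWithHead, save_trainable_only, and load_from_two_checkpoints are hypothetical stand-ins for my real classes and helpers:

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    """Stand-in for the pre-trained model (MyCausalModel)."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

class ModelWithHead(nn.Module):
    """Stand-in for the fine-tuned model (MyCausalModelWithHead)."""
    def __init__(self):
        super().__init__()
        self.backbone = Backbone()
        self.head = nn.Linear(4, 2)  # the "new layers"

    def forward(self, x):
        return self.head(self.backbone(x))

def save_trainable_only(model, path):
    # Keep only the parameters that still require gradients, i.e. the
    # unfrozen layers, and save just that partial state dict.
    trainable = {name for name, p in model.named_parameters() if p.requires_grad}
    state = {k: v for k, v in model.state_dict().items() if k in trainable}
    torch.save(state, path)
    return state

def load_from_two_checkpoints(model, backbone_path, head_path):
    # Load the frozen backbone weights from the first checkpoint, then
    # overlay the new layers from the second; strict=False tolerates the
    # keys that are missing from the partial head checkpoint.
    model.backbone.load_state_dict(torch.load(backbone_path))
    model.load_state_dict(torch.load(head_path), strict=False)
    return model
```

I believe save_pretrained accepts a state_dict argument, so the filtering half could presumably be routed through it; what I can't see is whether from_pretrained can merge two checkpoint directories like this natively.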