Gradual unfreezing support for fine-tuning models

Does the Hugging Face library have support for gradual unfreezing? It's one of the key strategies for fine-tuning a model effectively on a downstream task.

1 Like

You will have to do that manually; the Trainer won't do it automatically for you. To freeze the PyTorch model, you can do something like this:

from transformers import AutoModel

model = AutoModel.from_pretrained(model_id)

# Freeze every parameter in the model
for param in model.parameters():
    param.requires_grad = False

This way, you have more granular control over which layers you want to freeze. For your use case, you'd want to train just the linear layer initially. You can do this by freezing just the encoder:

# Freeze only the encoder; the head stays trainable
for param in model.encoder.parameters():
    param.requires_grad = False

You'd have to check whether encoder is an attribute of the model you're using to be fully sure. At later stages, you can flip requires_grad back to True to train the encoder as well.
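To sketch how that might look over multiple stages (assuming a BERT-style model where model.encoder.layer is the ModuleList of transformer blocks; attribute names vary by architecture, so check your model first):

from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

# Stage 0: freeze the whole encoder so only the task head trains.
for param in model.encoder.parameters():
    param.requires_grad = False

def unfreeze_top_blocks(model, n):
    # Make the top n encoder blocks trainable again.
    for block in model.encoder.layer[-n:]:
        for param in block.parameters():
            param.requires_grad = True

# Between training stages (e.g. after each epoch), unfreeze a few more blocks.
unfreeze_top_blocks(model, 2)   # stage 1: top 2 blocks
unfreeze_top_blocks(model, 4)   # stage 2: top 4 blocks

Keep in mind that a PyTorch optimizer only updates the parameters it was constructed with, so if you built it from only the trainable parameters you'll need to rebuild it (or add a parameter group) after unfreezing.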

I hope this gives you an idea.

4 Likes

I would be very happy to help support this if anyone has results that demonstrate its effectiveness!
There is also a freeze_params helper function here: https://github.com/huggingface/transformers/blob/e92efcf7286c955e6901f894be39cf6154af48b7/examples/seq2seq/utils.py#L277
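For reference, such a helper is essentially a thin wrapper around the same loop shown above; a sketch (not necessarily the exact code at that link):

import torch.nn as nn

def freeze_params(module: nn.Module):
    # Disable gradients for every parameter of the given module.
    for param in module.parameters():
        param.requires_grad = False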

1 Like

hi @sshleifer,

what counts as a “layer” for freezing?

I've created a model based on BertModel (with a custom head), and I'd like to freeze the embedding layer and some of the attention layers. I've applied:

ct = 0
for child in model.children():
    ct += 1
    if ct < 11:  # this freezes layers 1-10
        for param in child.parameters():
            param.requires_grad = False
but I’m not sure what that will have frozen.

How can I tell?
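One way to sanity-check it (just a sketch) would be to print each named parameter together with its requires_grad flag:

for name, param in model.named_parameters():
    print(name, param.requires_grad)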