I have a pytorch model with BertModel as the main part and a custom head. I want to freeze the embedding layer and the first few encoding layers, so that I can fine-tune the attention weights of the last few encoding layers and the weights of the custom layers.
I tried:
ct = 0
for child in model.children():
    ct += 1
    if ct < 11:  # ########## change value - this freezes layers 1-10
        for param in child.parameters():
            param.requires_grad = False
but I’m not sure that did what I want.
I then ran some code to check, but the layer names aren't recognized.
L1bb is the name of the BertModel section in my model, and L1bb.embeddings.word_embeddings.weight is shown in the output of the code that instantiates the model.
How can I freeze the first n layers?
What counts as a layer?
What are the names of the layers in BertModel?
How can I check which layers are frozen?
PS: how can I format this pasted code as code in the forum post? One section was formatted automatically, but nothing I do seems to affect the rest.
You should not rely on the order returned by the parameters method, as it does not necessarily match the order of the layers in your model. Instead, you should call it on specific parts of your model:
modules = [L1bb.embeddings, *L1bb.encoder.layer[:5]]  # Replace 5 by what you want
for module in modules:
    for param in module.parameters():
        param.requires_grad = False
This will freeze the embeddings layer and the first 5 transformer layers.
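If it helps to verify, here is a minimal sketch (assuming a plain BertModel loaded with transformers, e.g. bert-base-cased) that lists the top-level module names and then prints which parameters end up frozen:

from transformers import BertModel

model = BertModel.from_pretrained("bert-base-cased")

# Top-level children of BertModel: embeddings, encoder, pooler.
# The 12 transformer layers live in model.encoder.layer.
for name, _ in model.named_children():
    print(name)

# Freeze the embeddings and the first 5 encoder layers
modules = [model.embeddings, *model.encoder.layer[:5]]
for module in modules:
    for param in module.parameters():
        param.requires_grad = False

# Check which parameters are frozen (requires_grad == False)
for name, param in model.named_parameters():
    print(name, param.requires_grad)

The parameter names printed here (e.g. embeddings.word_embeddings.weight, encoder.layer.0.attention.self.query.weight, ...) are the ones you can match against when deciding what to freeze.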
hey @sgugger I tried your code with the BERT model (model = BertModel.from_pretrained("bert-base-cased")), so the code would be:
modules = [model.embeddings, model.encoder.layer[:5]]  # Replace 5 by what you want
for module in modules:
    for param in module.parameters():
        param.requires_grad = False
However, the changes did not seem to pass into the model. Did I miss anything? How do I call the new model?
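For what it's worth, the freeze happens in place on the same model object, so there is no new model to call; you keep using model as before. A minimal sketch (assuming a standard PyTorch setup; the learning rate is just a placeholder) for sanity-checking the frozen parameters and building an optimizer over only the trainable ones:

import torch

# Count trainable vs. total parameters as a quick check that freezing took effect
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")

# When training, pass only the still-trainable parameters to the optimizer
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)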