How to reset a layer?

Hi, I am looking for a way to reset a layer in a pre-trained model. For example, in a BART model, if I want to reset the last layer of the decoder, how should I implement it?

I noticed there is _init_weights(), which should be helpful, so I am wondering if the code should look like this:

# load the pre-trained model
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")
# reset a specific layer
model._init_weights(model.get_decoder().layers[n:n+1])

But I don't think I got it right, because the fine-tuning result doesn't change. Any ideas on this implementation? Thank you!


I am also stuck on this. Looking at the code of _init_weights, it appears to expect individual modules such as nn.Linear.

This would require looping over all the modules of your model that you would like to re-initialize and passing each one to _init_weights (sketched below). But this might not carry over to a different model, since the layer structure could differ. Is there not a way to just re-initialize a whole layer, or all modules under some component (e.g. BertLayer)?
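For reference, that manual loop would look roughly like this. This is just a sketch, assuming a BERT-style model such as bert-base-uncased; the attribute path to the layer will differ for other architectures:

from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# walk every submodule of the last encoder layer and pass it to _init_weights
for module in model.encoder.layer[-1].modules():
    model._init_weights(module)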

Okay, I think I have figured it out. You can recursively apply _init_weights to all submodules using apply. So, for example, if you want to re-initialize the last layer, the following should work:

from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Print the weights before and after the call to _init_weights to confirm they have been re-initialized
# print(model.encoder.layer[-1].attention.output.dense.weight)
model.encoder.layer[-1].apply(model._init_weights)
# print(model.encoder.layer[-1].attention.output.dense.weight)
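Coming back to the original BART question, the same apply trick should work on a specific decoder layer. A sketch, assuming facebook/bart-large as in the first post:

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")
# re-initialize every submodule of the last decoder layer in place
model.get_decoder().layers[-1].apply(model._init_weights)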