Freeze DeBERTa Layers

I am trying to freeze layers of deberta-v3-small. I first froze the first 4 encoder blocks with:

NUM_FROZEN_LAYERS = 83  # <--- this index corresponds to the last parameter of block 4

# Freeze every parameter up to that index
for name, param in list(model.named_parameters())[:NUM_FROZEN_LAYERS]:
    param.requires_grad = False

This works fine. However, when I later tried to freeze 6 blocks (up to parameter index 115), the following error was raised:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I played around and found that the error starts appearing once I freeze past parameter index 103, which is model.encoder.layer.6.attention.self.value_proj.weight.
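
For reference, I looked up which index corresponds to which parameter name by enumerating named_parameters, roughly like this (assuming the microsoft/deberta-v3-small checkpoint; the exact indices may differ depending on how the model is built or wrapped):

from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/deberta-v3-small")

# Print index, name and shape of every parameter so a cut-off index can be chosen
for i, (name, param) in enumerate(model.named_parameters()):
    print(i, name, tuple(param.shape))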

This looks very random. Does anyone know why this is happening, and whether there is another way to freeze certain blocks, for example by name as sketched below? Maybe I am not doing it the right way.
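
The alternative I had in mind is freezing by parameter name instead of by position, something like this sketch (NUM_FROZEN_BLOCKS and the regex are just my attempt at a cleaner approach, not a confirmed fix for the error; it also only touches the encoder blocks, so the input embeddings would need a separate check if they should stay frozen too):

import re

NUM_FROZEN_BLOCKS = 6  # freeze every encoder block with index below this cut-off

for name, param in model.named_parameters():
    # Pull the block index out of names like "encoder.layer.6.attention.self.value_proj.weight"
    block = re.search(r"encoder\.layer\.(\d+)\.", name)
    if block is not None and int(block.group(1)) < NUM_FROZEN_BLOCKS:
        param.requires_grad = False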
