I am trying to freeze
deberta-v3-small layers. I first froze 4 blocks with:
```python
NUM_FROZEN_LAYERS = 83  # <--- this index corresponds to the last parameter of block 4

for i, (name, param) in enumerate(list(model.named_parameters())[0:NUM_FROZEN_LAYERS]):
    param.requires_grad = False
```
This works fine. However, I later wanted to freeze 6 blocks (up to layer number 115) and the following error was raised:
```
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
I played around and found that the error starts appearing once I freeze past layer number 103. This layer is
This looks very random. Does anyone know why this is happening, and whether there is another way to freeze certain blocks? Maybe I am not doing it the right way.
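For reference, I was also considering freezing by parameter *name* instead of by positional index, so the cut-off can't land in the middle of a block. This is only an untested sketch, and it assumes the usual Hugging Face naming scheme where encoder block `i`'s parameters are prefixed `encoder.layer.{i}.` (the demo uses a toy module that mimics that naming rather than the real deberta-v3-small):

```python
import torch.nn as nn

def freeze_blocks(model: nn.Module, n_blocks: int, prefix: str = "encoder.layer.") -> None:
    """Freeze every parameter belonging to the first n_blocks encoder blocks,
    matched by name prefix rather than by position in named_parameters()."""
    frozen_prefixes = tuple(f"{prefix}{i}." for i in range(n_blocks))
    for name, param in model.named_parameters():
        if name.startswith(frozen_prefixes):
            param.requires_grad = False

# Small stand-in module mimicking the "encoder.layer.{i}" naming scheme.
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Module()
        self.encoder.layer = nn.ModuleList(nn.Linear(4, 4) for _ in range(12))

model = Toy()
freeze_blocks(model, 6)
# Each Linear contributes a weight and a bias, so 6 blocks -> 12 frozen tensors.
print(sum(not p.requires_grad for p in model.parameters()))  # → 12
```

Would that be considered the idiomatic way, or is slicing `named_parameters()` by index also fine?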