I am using the pretrained GPT-2 model from the Hugging Face `transformers` library. Is there any way to load only the weights of the FFNN (MLP) layer of every block, but not the weights of the attention layer? I am modifying the self-attention layer, so I don't want to load its pretrained weights.
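Here is roughly the approach I had in mind, a minimal sketch that assumes the standard `GPT2LMHeadModel` parameter naming (`transformer.h.<i>.attn.*` for attention, `transformer.h.<i>.mlp.*` for the FFNN); my modified attention class itself isn't shown here:

```python
from transformers import GPT2LMHeadModel, GPT2Config

# Pretrained checkpoint and a freshly initialized model
# (in my case, the fresh model has the modified attention layers)
pretrained = GPT2LMHeadModel.from_pretrained("gpt2")
model = GPT2LMHeadModel(GPT2Config())

# Keep every pretrained parameter except those belonging to the attention sub-modules
filtered_state = {
    name: tensor
    for name, tensor in pretrained.state_dict().items()
    if ".attn." not in name
}

# strict=False so the skipped attention weights (and any new parameters my
# attention layer introduces) stay at their freshly initialized values
missing, unexpected = model.load_state_dict(filtered_state, strict=False)
print("Not loaded (left randomly initialized):", missing)
```

Does filtering the state dict like this and loading with `strict=False` make sense, or is there a cleaner way to do it?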