Hi, I want to output the 20th `GPT2Block` in a GPT-2 medium model (24 `GPT2Block`s in total). I have used `register_forward_hook` and `output_hidden_states` separately, but they give different results. My code is as follows:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

model_config = AutoConfig.from_pretrained('gpt2-medium', output_hidden_states=True, return_dict_in_generate=True)
model = AutoModelForCausalLM.from_pretrained('gpt2-medium', config=model_config).cuda()
tok = AutoTokenizer.from_pretrained('gpt2-medium')
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default

features_in_hook = None  # to save the hooked output

# define hook
def hook(module, fea_in, fea_out):  # collect this module's output
    global features_in_hook
    features_in_hook = fea_out.clone().detach()
    return fea_out

model.eval()  # put dropout and layer norm into eval mode

for name, module in model.named_modules():
    if name == 'transformer.h.19.mlp.dropout':
        h = module.register_forward_hook(hook=hook)

prompt_tok = tok(["Who are you?", "What university are you in?"], padding=True, return_tensors="pt").to("cuda")
hidden_state = model(**prompt_tok)[2][20]
```
If I have done everything right, `hidden_state` should be the same as `features_in_hook`. Since the length of `model(**prompt_tok)[2]` is 25 and the first entry is the word embedding, the output of the 20th block should be at index 20. However, the two results obtained in these two ways are not the same. Have I done something wrong?
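For reference, here is a minimal self-contained sketch of the hook mechanism I am relying on, using a toy `nn.Linear` in place of GPT-2 so it runs without downloading any weights (the names `captured`, `layer`, and `handle` are just illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

captured = None  # filled in by the hook

def hook(module, inputs, output):
    # a forward hook receives exactly the tensor the module returned
    global captured
    captured = output.clone().detach()

layer = nn.Linear(4, 3)
handle = layer.register_forward_hook(hook)

x = torch.randn(2, 4)
out = layer(x)

assert torch.allclose(captured, out)  # hook output matches the module output
handle.remove()  # detach the hook when done
```

So my assumption is that a forward hook on a submodule returns that submodule's output tensor; the question is whether the output of `transformer.h.19.mlp.dropout` is the same tensor that ends up in `hidden_states[20]`.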