Last layer hidden state: GPT2

Hi, I am trying to understand exactly where the last hidden state in GPT2 comes from. The following code outputs the hidden states from the embedding layer and all 12 decoder blocks (1 + 12 = 13 entries):

import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')

sent = "Hello, my dog is cute"  # example input sentence
input_ids = torch.tensor(tokenizer.encode(sent)).unsqueeze(0)  # add batch dimension
outputs = model(input_ids, output_hidden_states=True, return_dict=True)
hidden_states = outputs['hidden_states']  # tuple of 13 tensors
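For what it's worth, I can confirm the count and shapes like this:

print(len(hidden_states))        # 13: embedding output + 12 decoder blocks
print(hidden_states[-1].shape)   # (1, seq_len, 768)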

Since GPT2 applies a final LayerNorm after the very last decoder block (ln_f: LayerNorm(768)), is the last entry of hidden_states taken before this final normalization layer (i.e., straight out of the last decoder block) or after it?
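One way I thought of to check this empirically (a minimal sketch, assuming the outputs object from the snippet above, and assuming last_hidden_state is the model's final output after ln_f) is to compare the last hidden_states entry against last_hidden_state:

# If this prints True, the last hidden_states entry already includes ln_f;
# otherwise it is the raw output of the last decoder block.
same = torch.allclose(outputs['hidden_states'][-1], outputs['last_hidden_state'])
print(same)

But I would like to be sure I am interpreting this correctly.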

I am running Transformers 3.1.0.

Thanks so much!