Returned Tensors and Hidden State

Hi, I’m just getting started with GPT2.

from https://huggingface.co/gpt2 :

from transformers import GPT2Tokenizer, GPT2Model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)

is said to yield the features of the text.
Upon inspecting the output, I see an irregularly shaped tuple of nested tensors. Looking at the source code for GPT2Model, this is supposed to represent the hidden state. I can guess what some of these dimensions represent (for example, the 768 dimension is clearly the embedding size), but in general I can’t find any documentation on how to interpret the information in output.

I also tried adding:
output = model(**encoded_input, output_attentions=True)
but I do not know how to interpret the dimensions of this either.
The docstring at https://huggingface.co/transformers/_modules/transformers/modeling_gpt2.html#GPT2Model tells me to “See attentions under returned tensors for more detail.”

But I cannot find what this is referring to. Can someone help me interpret the dimensions of these nested tuples?

Please refer to the GPT2 docs. They give a detailed description of what GPT2Model is supposed to return.

It returns (in order of output; a quick way to check these shapes follows the list):

  • last_hidden_state: (batch_size, sequence_length, hidden_size)
  • past: a list of length n_layers, each tensor of shape (2, batch_size, num_heads, sequence_length, embed_size_per_head)
  • hidden_states (returned when output_hidden_states=True): a tuple of n_layers + 1 tensors (the embedding output plus one per block), each of shape (batch_size, sequence_length, hidden_size)
  • attentions (returned when output_attentions=True): a tuple of n_layers tensors, each of shape (batch_size, num_heads, sequence_length, sequence_length)
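
For concreteness, here is one way to print those shapes yourself. This is a minimal sketch assuming the tuple-returning API this thread was written against, where use_cache defaults to True for GPT2; newer versions structure past differently:

from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
encoded_input = tokenizer("Replace me by any text you'd like.", return_tensors='pt')
output = model(**encoded_input, output_hidden_states=True, output_attentions=True)

# assumes the tuple layout described above
last_hidden_state, past, hidden_states, attentions = output

# (batch_size, sequence_length, hidden_size); hidden_size is 768 for gpt2
print(last_hidden_state.shape)

# n_layers entries, each (2, batch_size, num_heads, sequence_length, embed_size_per_head)
print(len(past), past[0].shape)

# n_layers + 1 entries (embedding output plus one per block), same shape as last_hidden_state
print(len(hidden_states), hidden_states[0].shape)

# n_layers entries, each (batch_size, num_heads, sequence_length, sequence_length)
print(len(attentions), attentions[0].shape)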

Can’t believe I missed this!

Hey @azhx,

if you are on master, then you can also use the ModelOutput object, which is a dict-like object that lets you access the outputs as out.attentions, out.hidden_states, etc. For GPT2Model, it returns a BaseModelOutputWithPast. You can find the docs here
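
For example, a minimal sketch assuming a recent install where return_dict is available (return_dict=True asks for the ModelOutput instead of the plain tuple):

from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
encoded_input = tokenizer("Replace me by any text you'd like.", return_tensors='pt')

# return_dict=True returns a BaseModelOutputWithPast instead of a tuple
output = model(**encoded_input,
               output_attentions=True,
               output_hidden_states=True,
               return_dict=True)

print(output.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
print(len(output.attentions))          # one tensor per layer
print(len(output.hidden_states))       # embedding output plus one per layer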


Yes, I’m on master now and using this, thanks!