Hi, just quickly getting started with GPT2.
The following snippet, from https://huggingface.co/gpt2:
from transformers import GPT2Tokenizer, GPT2Model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
is said to yield the features of the text.
Upon inspecting the output, I see an irregularly shaped tuple with nested tensors. Looking at the source code for GPT2Model, this is supposed to represent the hidden state. I can guess what some of these dimensions represent (for example, the 768 dimension is obviously the word embedding size), but in general I can’t find any documentation about interpreting the information in output.
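To make the nesting easier to read, here is a minimal sketch of a helper that walks a nested structure and lists the shape of every tensor it finds. The names `describe` and `FakeTensor` are my own, not part of transformers, and the demo shapes are made up. The helper only assumes that tensors expose a `.shape` attribute, which torch tensors do.

```python
def describe(obj, name="output", indent=0):
    """Return lines describing the nesting and tensor shapes inside obj."""
    pad = "  " * indent
    lines = []
    if hasattr(obj, "shape"):
        # Tensor-like leaf: report its shape as a plain tuple.
        lines.append(f"{pad}{name}: tensor {tuple(obj.shape)}")
    elif isinstance(obj, dict):
        # Dict-like containers: recurse into each named field.
        lines.append(f"{pad}{name}: dict of {len(obj)}")
        for key, value in obj.items():
            lines.extend(describe(value, str(key), indent + 1))
    elif isinstance(obj, (tuple, list)):
        # Tuples and lists: recurse into each positional item.
        lines.append(f"{pad}{name}: {type(obj).__name__} of {len(obj)}")
        for i, item in enumerate(obj):
            lines.extend(describe(item, f"[{i}]", indent + 1))
    else:
        # Anything else (None, custom objects): just report the type.
        lines.append(f"{pad}{name}: {type(obj).__name__}")
    return lines


class FakeTensor:
    """Hypothetical stand-in with a .shape attribute, so the demo runs without torch."""

    def __init__(self, *shape):
        self.shape = shape


# Made-up nested structure, only to show what the printout looks like.
demo = (FakeTensor(1, 4, 8), (FakeTensor(2, 1, 2, 4, 4),) * 2)
print("\n".join(describe(demo)))
```

With the real model, calling print("\n".join(describe(output))) on the output above should give the same kind of listing for the actual tensors.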
I also tried adding:
output = model(**encoded_input, output_attentions=True)
but I do not know how to interpret the dimensions of this either.
I am told to “See attentions under returned tensors for more detail.” in the docstring at https://huggingface.co/transformers/_modules/transformers/modeling_gpt2.html#GPT2Model
But I cannot find what this is referring to. Can someone help me interpret the dimensions of these nested tuples?
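For comparison, here is a rough sketch of the shapes I would expect, going by how I read the documented return values: the last hidden state as (batch_size, sequence_length, hidden_size), and the attentions as one tensor per layer of shape (batch_size, num_heads, sequence_length, sequence_length). The sizes 12 layers, 12 heads and 768 hidden units are the published values for the base gpt2 checkpoint. The names `GPT2SmallDims` and `expected_shapes` are hypothetical, and the sequence length of 10 is just an example value, not the real token count of the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPT2SmallDims:
    """Sizes of the base gpt2 checkpoint (assumed here, not read from the model)."""

    n_layer: int = 12   # number of transformer blocks
    n_head: int = 12    # attention heads per block
    n_embd: int = 768   # hidden size / embedding width


def expected_shapes(batch_size, seq_len, dims=GPT2SmallDims()):
    """Shapes expected for the hidden state and the per-layer attention maps."""
    return {
        # One hidden vector of width n_embd per input token.
        "last_hidden_state": (batch_size, seq_len, dims.n_embd),
        # One attention map per layer; each head gets a seq_len x seq_len grid.
        "attentions": [(batch_size, dims.n_head, seq_len, seq_len)] * dims.n_layer,
    }


shapes = expected_shapes(batch_size=1, seq_len=10)  # seq_len is an example value
print(shapes["last_hidden_state"])
print(len(shapes["attentions"]), shapes["attentions"][0])
```

If that reading is right, the entry at index [layer][batch, head, i, j] of the attentions would be how much token i attends to token j in that head. The exact container type of the output may differ between transformers versions.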