Recovering input IDs from input embeddings using GPT-2

Suppose I have the following text

aim = 'Hello world! you are a wonderful place to be in.'

I want to use GPT2 to produce the input_ids, then produce the embeddings, and from the embeddings recover the input_ids. To do this I do:

from transformers import GPT2Tokenizer, GPT2Model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

The input_ids can be defined as:

input_ids = tokenizer(aim)['input_ids']
#output: [15496, 995, 0, 345, 389, 257, 7932, 1295, 284, 307, 287, 13]

I can decode this to make sure it reproduces the aim:

tokenizer.decode(input_ids)
#output: 'Hello world! you are a wonderful place to be in.'

as expected! To produce the embeddings I convert the input_ids to a tensor:

import torch

input_ids_tensor = torch.tensor([input_ids])

I can then produce my embeddings as:

# Generate the embeddings for input IDs 
with torch.no_grad():
    model_output = model(input_ids_tensor)
    last_hidden_states = model_output.last_hidden_state
# Extract the embeddings for the input IDs from the last hidden layer
input_embeddings = last_hidden_states[0,1:-1,:]

Now, as mentioned earlier, the aim is to use input_embeddings to recover the input_ids, so I do:

x = torch.unsqueeze(input_embeddings, 1) # to make the dim acceptable
with torch.no_grad():
    text = model(x.long())
    decoded_text = tokenizer.decode(text[0].argmax(dim=-1).tolist())

But doing this I get:

IndexError: index out of range in self

at the level of text = model(x.long()). I wonder: what am I doing wrong? How can I recover the input_ids using the embeddings I produced?

Hey @iliboy

I’ve read a few misconceptions/incorrect terminology, so let me try to clear them up.

When you call model(input_ids_tensor), you get the model output. GPT2 was trained to predict the next word, so the model output is related to the next word in the sequence. What you described as “input_embeddings” in your code example is closer to “output_embeddings” :slight_smile:

If you want the model’s input embeddings, you can do input_embeddings = model.wte(input_ids_tensor) (wte is GPT-2’s token-embedding matrix; wpe holds the positional embeddings). Now, embeddings are a one-way street, so the best chance you have at reversing them is to apply a vector-comparison operation like cosine similarity against the embedding matrix. I’d suggest searching and reading more on these topics if you want to pursue them :open_book:
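A minimal sketch of that nearest-neighbour recovery. A small random embedding table stands in for GPT-2’s token-embedding matrix (model.wte.weight) so the example runs without downloading the model; the same idea applies with the real matrix:

```python
import torch

# Stand-in for GPT-2's token-embedding matrix (model.wte in the real model)
torch.manual_seed(0)
vocab_size, hidden = 100, 16
embedding = torch.nn.Embedding(vocab_size, hidden)

input_ids = torch.tensor([15, 42, 7])
input_embeddings = embedding(input_ids)             # shape (3, hidden)

# Cosine similarity between each input embedding and every row of the matrix
sims = torch.nn.functional.cosine_similarity(
    input_embeddings.unsqueeze(1),                  # (3, 1, hidden)
    embedding.weight.unsqueeze(0),                  # (1, vocab_size, hidden)
    dim=-1,
)
recovered_ids = sims.argmax(dim=-1)
print(recovered_ids)  # recovers [15, 42, 7] exactly, since these are raw input embeddings
```

This works exactly for raw input embeddings (each vector matches its own row with similarity 1); it only becomes approximate once the vectors have passed through the transformer layers.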