Hello, I am trying to fine-tune the T5 model by hand through PyTorch. I was hoping to extract the logits from the model output; however, the output is of class Seq2SeqModelOutput and does not contain a logits attribute.
What it has is: last_hidden_state, past_key_values, decoder_hidden_states, decoder_attentions, cross_attentions, encoder_last_hidden_state, encoder_hidden_states, encoder_attentions.
Is there any way for me to extract the logits from one of these attributes?
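In case it helps, here is a minimal sketch of what I am seeing (t5-small is just an example checkpoint; the decoder input below is arbitrary, since the point is only to inspect the output class):

```python
# Minimal repro: T5Model's forward returns a Seq2SeqModelOutput,
# which has no logits field.
from transformers import T5Model, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5Model.from_pretrained("t5-small")

enc = tokenizer("translate English to German: Hello", return_tensors="pt")
dec = tokenizer("Hallo", return_tensors="pt")

outputs = model(
    input_ids=enc.input_ids,
    attention_mask=enc.attention_mask,
    decoder_input_ids=dec.input_ids,
)
print(type(outputs).__name__)  # Seq2SeqModelOutput
print(outputs.keys())          # last_hidden_state, past_key_values, ... -- no logits
```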
Hmm, could you help me reason through the shape of the last_hidden_state tensor? I would expect it to be of shape (batch_size, context_window_size, vocab_size), but the documentation says it has shape (batch_size, sequence_length, hidden_size).
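Continuing the snippet above, the shape can be checked directly against the model config (the numbers in the comments assume the t5-small checkpoint):

```python
# The last dimension matches d_model, not vocab_size, so this tensor
# holds hidden states rather than logits.
print(outputs.last_hidden_state.shape)  # torch.Size([1, decoder_len, 512])
print(model.config.d_model)             # 512
print(model.config.vocab_size)          # 32128
```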
last_hidden_state is NOT the logits tensor. If it were, its last dimension would be vocab_size rather than hidden_size. My best guess is that it is the output of the final transformer block BEFORE the unembedding layer is applied, which is what would produce the logits. However, the model I was originally using (T5Model) is the bare encoder-decoder and does not expose the unembedding layer, so I switched over to T5ForConditionalGeneration, which adds the language-modeling head on top.
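For anyone who finds this later, here is a sketch of what ended up working for me, again assuming the t5-small checkpoint. T5ForConditionalGeneration returns a Seq2SeqLMOutput whose logits field has the expected (batch_size, sequence_length, vocab_size) shape, and the unembedding layer is exposed as model.lm_head. One caveat I noticed in the transformers source: when the input and output embeddings are tied, T5 rescales the decoder hidden state by d_model ** -0.5 before applying the head.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()  # disable dropout so the manual recomputation below matches

enc = tokenizer("translate English to German: Hello", return_tensors="pt")
labels = tokenizer("Hallo", return_tensors="pt").input_ids

with torch.no_grad():
    outputs = model(
        input_ids=enc.input_ids,
        attention_mask=enc.attention_mask,
        labels=labels,               # also gives outputs.loss for fine-tuning
        output_hidden_states=True,
    )

print(outputs.logits.shape)  # (batch_size, target_seq_len, vocab_size)

# Recomputing the logits by hand from the final decoder hidden state:
hidden = outputs.decoder_hidden_states[-1]
if model.config.tie_word_embeddings:
    # T5 rescales before the LM head when embeddings are tied
    hidden = hidden * (model.config.d_model ** -0.5)
manual_logits = model.lm_head(hidden)
print(torch.allclose(manual_logits, outputs.logits, atol=1e-4))  # True
```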