T5: classification using text2text?

forward as in without using generate?

As T5 is trained using a text-to-text approach, we need to generate the output as text, either by calling forward manually or by using generate. If we wanted to do this as a discriminative task, we could take the same approach as BART, where we feed the same text to both the encoder and the decoder, pool the hidden states of the final eos token, and pass that to a classification head; this is how BartForSequenceClassification works. I'm not sure how well this would work for T5, I haven't tried it myself.
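
For illustration, here's a rough, untested sketch of that BART-style idea applied to T5. Using T5Model as the backbone, the eos pooling, and the classification head are my own assumptions rather than anything that ships in the library, and the head would of course need to be fine-tuned before it predicts anything meaningful.

import torch
import torch.nn as nn
from transformers import T5Model, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
backbone = T5Model.from_pretrained("t5-small")
# hypothetical, untrained classification head (2 labels for sst2)
classification_head = nn.Linear(backbone.config.d_model, 2)

text = "it confirms fincher 's status as a film maker who artfully bends technical know-how to the service of psychological insight"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # feed the same tokens to both encoder and decoder, as BART does
    outputs = backbone(
        input_ids=enc.input_ids,
        attention_mask=enc.attention_mask,
        decoder_input_ids=enc.input_ids,
    )
    hidden = outputs[0]  # decoder hidden states, (batch, seq_len, d_model)

    # pool the hidden state at the final eos token of each sequence
    eos_mask = enc.input_ids.eq(tokenizer.eos_token_id)
    eos_hidden = hidden[eos_mask].view(hidden.size(0), -1, hidden.size(-1))[:, -1, :]

    logits = classification_head(eos_hidden)  # (batch, num_labels)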

To answer the original question, you could use forward as shown below to generate the output:

import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()

text = "sst2 sentence: it confirms fincher ’s status as a film maker who artfully bends technical know-how to the service of psychological insight"
with torch.no_grad():
  enc = tokenizer(text, return_tensors="pt")
  # T5 uses the pad token as the decoder start token
  decoder_input_ids = torch.tensor([tokenizer.pad_token_id]).unsqueeze(0)
  logits = model(**enc, decoder_input_ids=decoder_input_ids)[0]
  tokens = torch.argmax(logits, dim=2)
  sentiments = tokenizer.batch_decode(tokens)
  # => ['positive']
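
For comparison (this part is not from the snippet above, just a sketch), the same prediction with generate would look roughly like this:

# equivalent using generate: let the model write the label text itself
with torch.no_grad():
    out_ids = model.generate(**enc, max_length=3)
print(tokenizer.decode(out_ids[0], skip_special_tokens=True))
# => 'positive'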

Now if we wish to measure the probabilities, as I described in the earlier comment, we can take only the logits of the positive and negative tokens and apply softmax over them. Thankfully T5 encodes positive and negative as single tokens, so this is easy to do. The token id for positive is 1465 and for negative it's 2841.

logits = logits.squeeze(1)
# only take the logits of positive and negative
selected_logits = logits[:, [1465, 2841]] 

probs = F.softmax(selected_logits, dim=1)
#=> tensor([[0.9820, 0.0180]])
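
If you'd rather not hard-code those ids, something like this should let you look them up yourself (the exact values depend on the tokenizer vocab):

# sanity check: each label word should map to a single sentencepiece token
pos_ids = tokenizer("positive", add_special_tokens=False).input_ids  # expected: [1465]
neg_ids = tokenizer("negative", add_special_tokens=False).input_ids  # expected: [2841]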

Hope this answers your question.

cc @sshleifer
