T5: classification using text2text?

forward as in without using generate?

As T5 is trained using a text-to-text approach, we need to generate the output as text, either by calling forward manually or by using generate. If we wanted to do this as a discriminative task, we could take the same approach as BART, where we feed the same text to both the encoder and the decoder, pool the hidden states of the final eos token, and pass that to a classification head; this is how BartForSequenceClassification works. I'm not sure how well this would work for T5, I haven't tried it myself.
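
For illustration, here's a rough, untested sketch of that BART-style idea applied to T5. Using T5Model as the backbone, the eos pooling, and the classification head are my own assumptions rather than anything that ships in the library, and the head would of course need to be fine-tuned before it predicts anything meaningful.

import torch
import torch.nn as nn
from transformers import T5Model, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
backbone = T5Model.from_pretrained("t5-small")
# hypothetical, untrained classification head (2 labels for sst2)
classification_head = nn.Linear(backbone.config.d_model, 2)

text = "it confirms fincher 's status as a film maker who artfully bends technical know-how to the service of psychological insight"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # feed the same tokens to both encoder and decoder, as BART does
    outputs = backbone(
        input_ids=enc.input_ids,
        attention_mask=enc.attention_mask,
        decoder_input_ids=enc.input_ids,
    )
    hidden = outputs[0]  # decoder hidden states, (batch, seq_len, d_model)

    # pool the hidden state at the final eos token of each sequence
    eos_mask = enc.input_ids.eq(tokenizer.eos_token_id)
    eos_hidden = hidden[eos_mask].view(hidden.size(0), -1, hidden.size(-1))[:, -1, :]

    logits = classification_head(eos_hidden)  # (batch, num_labels)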

To answer the original question, you could use forward as shown below to generate the output:

import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()

text = "sst2 sentence: it confirms fincher ’s status as a film maker who artfully bends technical know-how to the service of psychological insight"
with torch.no_grad():
  enc = tokenizer(text, return_tensors="pt")
  # T5 uses the pad token as the decoder start token
  decoder_input_ids = torch.tensor([tokenizer.pad_token_id]).unsqueeze(0)
  logits = model(**enc, decoder_input_ids=decoder_input_ids)[0]
  tokens = torch.argmax(logits, dim=2)
  sentiments = tokenizer.batch_decode(tokens)
  # => ['positive']
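
For comparison (this part is not from the snippet above, just a sketch), the same prediction with generate would look roughly like this:

# equivalent using generate: let the model write the label text itself
with torch.no_grad():
    out_ids = model.generate(**enc, max_length=3)
print(tokenizer.decode(out_ids[0], skip_special_tokens=True))
# => 'positive'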

Now if we wish to measure the probabilities, as I described in the earlier comment, we can take only the logits of the positive and negative tokens and apply softmax over them. Thankfully T5 encodes positive and negative as single tokens, so this is easy to do. The token id for positive is 1465 and for negative it's 2841.

logits = logits.squeeze(1)
# only take the logits of positive and negative
selected_logits = logits[:, [1465, 2841]] 

probs = F.softmax(selected_logits, dim=1)
#=> tensor([[0.9820, 0.0180]])
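
If you'd rather not hard-code those ids, something like this should let you look them up yourself (the exact values depend on the tokenizer vocab):

# sanity check: each label word should map to a single sentencepiece token
pos_ids = tokenizer("positive", add_special_tokens=False).input_ids  # expected: [1465]
neg_ids = tokenizer("negative", add_special_tokens=False).input_ids  # expected: [2841]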

Hope this answers your question.

cc @sshleifer
