forward as in without using generate?
As T5 is trained with the text-to-text approach, we need to generate the output as text, either by calling forward manually or by using generate. If we wanted to treat this as a discriminative task, we could take the same approach as BART: feed the same text to both the encoder and the decoder, pool the hidden state of the final eos token, and pass that to a classification head. This is how BartForSequenceClassification works. I'm not sure how well this would work for T5, I haven't tried it myself, but a rough sketch of the idea is below.
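For illustration only, here is an untested sketch of that BART-style approach applied to T5. The class T5ClassificationSketch and its eos pooling are my own assumptions modeled on BartForSequenceClassification, not an existing transformers class:

import torch.nn as nn
from transformers import T5Model

# hypothetical sketch of a BART-style classification head on top of T5;
# this class does not exist in transformers
class T5ClassificationSketch(nn.Module):
    def __init__(self, model_name="t5-small", num_labels=2):
        super().__init__()
        self.t5 = T5Model.from_pretrained(model_name)
        self.classifier = nn.Linear(self.t5.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        # feed the same text to both encoder and decoder, as BART does
        outputs = self.t5(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_input_ids=input_ids,
            decoder_attention_mask=attention_mask,
        )
        hidden = outputs[0]  # decoder hidden states, (batch, seq_len, d_model)
        # pool the hidden state of the final eos (</s>) token of each sequence
        eos_mask = input_ids.eq(self.t5.config.eos_token_id)
        eos_hidden = hidden[eos_mask].view(hidden.size(0), -1, hidden.size(-1))[:, -1, :]
        return self.classifier(eos_hidden)  # (batch, num_labels)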
To answer the original question, you can use forward as shown below to generate the output:
import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()

text = "sst2 sentence: it confirms fincher ’s status as a film maker who artfully bends technical know-how to the service of psychological insight"

with torch.no_grad():
    enc = tokenizer(text, return_tensors="pt")
    # T5 starts decoding from the pad token
    decoder_input_ids = torch.tensor([tokenizer.pad_token_id]).unsqueeze(0)
    logits = model(**enc, decoder_input_ids=decoder_input_ids)[0]  # (1, 1, vocab_size)
    tokens = torch.argmax(logits, dim=2)
    sentiments = tokenizer.batch_decode(tokens)
    # => ['positive']
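For comparison, the same prediction with generate would look like this (a side note from me, not part of the original snippet):

out_ids = model.generate(**enc)
print(tokenizer.batch_decode(out_ids, skip_special_tokens=True))  # ['positive']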
Now, if we wish to measure the probabilities, as I described in the earlier comment, we can take just the logits of the positive and negative tokens from the forward call above and apply softmax over them. Thankfully T5 encodes positive and negative as single tokens, so this is easy to do: the token id for positive is 1465 and for negative it is 2841.
logits = logits.squeeze(1)  # (1, vocab_size)
# only take the logits of the positive and negative tokens
selected_logits = logits[:, [1465, 2841]]
probs = F.softmax(selected_logits, dim=1)
# => tensor([[0.9820, 0.0180]])
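If you want to double-check those ids (or look up the ids for other label words), you can ask the tokenizer directly:

# sanity check of the token ids used above
print(tokenizer.convert_tokens_to_ids(tokenizer.tokenize("positive")))  # [1465]
print(tokenizer.convert_tokens_to_ids(tokenizer.tokenize("negative")))  # [2841]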
Hope this answers your question.
cc @sshleifer