T5 for classification tasks


I’m confused about how to train T5 for a classification task.

I’ve looked at many examples showing how to train a T5 model for classification.

In sentiment classification, let’s say the answer labels are positive and negative.
In this case, fortunately, positive and negative are each tokenized into a single token (one entry in input_ids).
Therefore, after the lm_head, we can just compare the logits of those two tokens (positive, negative).
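
To make the single-token case concrete, here is a minimal sketch in plain Python that is not tied to any particular library. It takes the vocabulary logits at the first decoder step and renormalizes over only the two label tokens. The ids `POSITIVE_ID` and `NEGATIVE_ID`, the helper name `classify_single_token`, and the tiny 8-entry vocabulary are made up for illustration; real ids would come from the tokenizer.

```python
import math

# Hypothetical token ids; in practice they come from the tokenizer.
POSITIVE_ID = 3
NEGATIVE_ID = 5


def classify_single_token(first_step_logits, label_ids):
    """Pick the label whose single token has the highest logit.

    first_step_logits: list of floats, one per vocabulary entry,
        taken from lm_head at the first decoder position.
    label_ids: dict mapping label name -> its single token id.
    Returns (best_label, probabilities renormalized over the labels).
    """
    scores = {name: first_step_logits[tok] for name, tok in label_ids.items()}
    # Softmax over only the candidate label tokens (numerically stable).
    m = max(scores.values())
    exps = {name: math.exp(s - m) for name, s in scores.items()}
    total = sum(exps.values())
    probs = {name: e / total for name, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Toy logits over an 8-token vocabulary (invented values).
logits = [0.1, -1.0, 0.3, 2.0, 0.0, 0.5, -0.2, 1.0]
label, probs = classify_single_token(
    logits, {"positive": POSITIVE_ID, "negative": NEGATIVE_ID}
)
print(label, probs)
```

All other vocabulary entries are ignored here, so the two probabilities always sum to 1.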

But in NLI classification, the answer labels are contradiction, entailment, and neutral.
Unfortunately, those labels are not each tokenized into a single token.
(I’m not sure, but let’s assume contradiction and entailment are each tokenized into 3 tokens.)
In this case, I cannot compare logits the way I did in the example above.
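
One possible way to still compare labels when they span several tokens, sketched here under assumptions, is to score each candidate label by the sum of the log-probabilities of its tokens, where each step's logits are computed with the label's own earlier tokens fed to the decoder. The token ids, the per-step logits, and the helper names below are all invented for illustration; nothing here reflects how the real T5 tokenizer splits these words.

```python
import math


def log_softmax(logits):
    """Log-probabilities over the vocabulary for one decoder step."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(step_logits, token_ids):
    """Sum of log p(token_t | earlier tokens) over a multi-token label.

    step_logits[t] holds the vocabulary logits at decoder step t, obtained
    by feeding the label's own earlier tokens as decoder input.
    """
    assert len(step_logits) == len(token_ids)
    return sum(log_softmax(l)[tok] for l, tok in zip(step_logits, token_ids))


# Toy 4-token vocabulary; ids and logits are made up.
candidates = {
    "entailment": {
        "ids": [1, 2],
        "logits": [[0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]],
    },
    "contradiction": {
        "ids": [3, 0, 2],
        "logits": [
            [0.0, 2.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
        ],
    },
}
scores = {
    name: sequence_log_prob(c["logits"], c["ids"])
    for name, c in candidates.items()
}
prediction = max(scores, key=scores.get)
print(prediction, scores)
```

Summed log-probabilities tend to favor shorter labels, so dividing each score by the number of tokens is a variation worth considering.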


  1. In the NLI case, should I add contradiction, entailment, and neutral as new tokens to the tokenizer so that I can train the model just like in the first case (positive/negative classification)?

  2. I think I could instead generate the labels, since that’s just the way T5 works. But I’m not familiar with this procedure.
    Please explain it with simple code (especially the loss part).

  3. Are question 1 and question 2 the same after all?
    If not, which one is better?
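
For the loss part of question 2, here is a minimal plain-Python sketch of what I understand the generation-style loss to be: the mean cross-entropy over the tokens of the target label, with each step predicted given the correct earlier tokens (teacher forcing). In Hugging Face Transformers, `T5ForConditionalGeneration` returns this kind of loss when `labels` are passed, with `-100` marking positions to ignore. The helper name `token_cross_entropy`, the token ids, and the logits below are invented for illustration.

```python
import math

IGNORE_INDEX = -100  # label positions with this value are skipped (e.g. padding)


def token_cross_entropy(step_logits, labels):
    """Mean negative log-likelihood of the target tokens.

    step_logits[t]: vocabulary logits at decoder step t.
    labels[t]: target token id at step t, or IGNORE_INDEX to skip it.
    """
    total, count = 0.0, 0
    for logits, target in zip(step_logits, labels):
        if target == IGNORE_INDEX:
            continue
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[target]  # equals -log softmax(logits)[target]
        count += 1
    return total / count


# A label assumed to split into 3 tokens, plus one padded position.
labels = [3, 0, 2, IGNORE_INDEX]
step_logits = [
    [0.0, 0.0, 0.0, 4.0],
    [4.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # ignored because its label is IGNORE_INDEX
]
loss = token_cross_entropy(step_logits, labels)
print(loss)
```

With this formulation a multi-token label needs no special handling: each of its tokens simply contributes one term to the average.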

Thank you!