How to get the result probabilities from T5 decoding output?

Hello,

I am using t5-base to map phrases into categories, for example: “I want to eat” → “hunger”. I have hundreds of different categories, and each category may have only 1-3 example phrases. Therefore, this is not a classical multi-class classification task. I would rather call it a text-to-text mapping task.

Is there any way to get the probability of each result returned for a phrase (see the code snippet below)?

For example, if the phrase is “He is hungry”, the model returns the top 5 most relevant labels. These results seem to be ordered by some relevance rank, so that the most relevant label is always first in outputs. So my question is: how can I retrieve the probabilities behind this ranking?

My final goal is to set a threshold on the probability, so that outputs only includes results that pass this threshold. If no result passes the threshold, that should mean nothing relevant was found.

from transformers import T5ForConditionalGeneration, T5Tokenizer

t5_tokenizer = T5Tokenizer.from_pretrained('t5-base')
t5_model = T5ForConditionalGeneration.from_pretrained('t5-base')
...
model.model.eval()
outputs = model.model.generate(
    input_ids=test_input_ids,
    attention_mask=test_attention_mask,
    max_length=64,
    early_stopping=True,
    num_beams=10,
    num_return_sequences=5,
    no_repeat_ngram_size=2
)

for output in outputs:
    result = t5_tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(result)

Hi Liana, did you figure out how to do this?
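
One possible approach, offered as a sketch rather than a confirmed solution: in Hugging Face transformers, calling generate with return_dict_in_generate=True and output_scores=True makes beam search return an object with sequences and sequences_scores instead of a bare tensor. The sequences_scores are log-domain beam scores (summed token log-probabilities, normalized by length according to length_penalty), so exp() of a score behaves like a length-normalized likelihood, not a calibrated class probability. The helper name filter_by_threshold and the sample scores below are made up for illustration; in practice the labels would come from decoding outputs.sequences and the scores from outputs.sequences_scores.tolist().

```python
import math


def filter_by_threshold(labels, sequence_scores, threshold):
    """Keep (label, prob) pairs whose exp(score) is at least `threshold`.

    `sequence_scores` are assumed to be log-domain beam scores, e.g.
    outputs.sequences_scores.tolist() from a generate() call made with
    return_dict_in_generate=True and output_scores=True.
    """
    kept = []
    for label, score in zip(labels, sequence_scores):
        prob = math.exp(score)  # log-domain score -> value in (0, 1]
        if prob >= threshold:
            kept.append((label, prob))
    return kept


# Hypothetical values for illustration only, not real model output.
example_labels = ["hunger", "thirst", "fatigue"]
example_scores = [-0.1, -1.2, -3.0]

kept = filter_by_threshold(example_labels, example_scores, threshold=0.25)
for label, prob in kept:
    print(f"{label}: {prob:.3f}")

if not kept:
    print("nothing relevant found")
```

An empty list then plays the role of "nothing relevant found". The threshold itself would need to be tuned on held-out phrases, since these scores depend on label length and on the length_penalty setting.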