How to get probability of the first generated token?

When doing conditional generation with (m)BART, how can I get the probability of the first generated token? I would like to use it as a confidence score to filter my results (I generate very short summaries which are essentially answers extracted from the text). The generate method only returns token ids, and I could not figure out which number to use from the raw model output (after softmax on dim=-1). Also, why does the output have dimensions like batch_size x 638 x 250027? 250027 is the vocabulary size, I guess, but what is 638? I thought it should be max_source_length, which is 1024 (assuming the max output length equals the max input length).
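
Roughly what I am doing, as a simplified sketch (the checkpoint name is just a placeholder for the model I actually use):

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# placeholder checkpoint for the (m)BART model I actually use
tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-cc25")

texts = ["first input text ...", "second input text ..."]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**batch)   # raw model output

# outputs.logits has shape (batch_size, sequence_length, vocab_size);
# for my real data that is batch_size x 638 x 250027
probs = outputs.logits.softmax(dim=-1)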

Hi marton,

As you can see in the doc

  • Apply argmax over the last dimension to get the predicted token ids; if you want probabilities rather than raw logits, apply softmax first.
  • 638 is the actual sequence length: sentences are usually padded only to the length of the longest sequence in the batch, not to max_source_length (see the padding sketch below).
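
A quick illustration of the padding behaviour (the tokenizer checkpoint is only a placeholder):

from transformers import AutoTokenizer

# placeholder checkpoint, only to illustrate the two padding strategies
tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-cc25")

texts = ["a short sentence", "a much longer sentence that sets the batch length"]

batch_longest = tokenizer(texts, return_tensors="pt", padding=True)                       # pad to the longest sequence in the batch -> your 638
batch_max = tokenizer(texts, return_tensors="pt", padding="max_length", max_length=1024)  # pad everything to max_source_length -> 1024

print(batch_longest["input_ids"].shape)   # torch.Size([2, <longest in batch>])
print(batch_max["input_ids"].shape)       # torch.Size([2, 1024])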

What you have got is called prediction_scores (the raw logits). Now you can do:

probs = prediction_scores.softmax(dim=-1)                      # (batch_size, sequence_length, vocab_size)
predicts = probs.argmax(dim=-1)                                # (batch_size, sequence_length), dtype=torch.long
scores = probs.gather(-1, predicts.unsqueeze(-1)).squeeze(-1)  # (batch_size, sequence_length), dtype=torch.float

scores is now the probability of every predicted token; if you only want the first token of each sentence, just take the first position along the sequence dimension.
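
For example, with the shapes above:

first_token_scores = scores[:, 0]   # (batch_size,) probability of each sentence's first predicted token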

Yes, I applied argmax. The problem is that generation requires, for example, a special starting token to be fed to the decoder, and I could not find an example in the docs of generating without the .generate() function, which only returns the token ids. Also, the first predicted token and the token generated after I fed BOS to the decoder are sometimes different.
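
One possible workaround, assuming a transformers version where .generate() accepts output_scores and return_dict_in_generate (the checkpoint name below is only a placeholder), is to let generate return the per-step scores and read off the first step:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# placeholder checkpoint, swap in your fine-tuned (m)BART model
tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-cc25")

inputs = tokenizer(["some input text"], return_tensors="pt", padding=True)

# greedy generation that also returns the scores used at every generation step
outputs = model.generate(
    **inputs,
    num_beams=1,
    do_sample=False,
    return_dict_in_generate=True,
    output_scores=True,
    max_length=32,
)

# outputs.scores is a tuple with one (batch_size, vocab_size) tensor per generated step;
# outputs.scores[0] holds the logits used to pick the very first generated token
first_step_probs = outputs.scores[0].softmax(dim=-1)   # (batch_size, vocab_size)

# outputs.sequences starts with the decoder start token, so the first generated
# token of each sequence sits at position 1
first_token_ids = outputs.sequences[:, 1]              # (batch_size,)
confidence = first_step_probs.gather(-1, first_token_ids.unsqueeze(-1)).squeeze(-1)

# note: with mBART the first generated step may be a forced language/BOS token,
# in which case the score you care about is at a later step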