Hey everyone!

We have just merged a PR that exposes a new function related to `.generate()`, `compute_transition_scores`. With this function, you can quickly solve any problem that requires the probabilities of generated tokens, for any generation strategy. It is also nicely documented; see here.

How can this function help you? Let me give you two simple examples!

## Example 1 -- print the probabilities for the output generated by Greedy Search

```
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import numpy as np
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token_id = tokenizer.eos_token_id
inputs = tokenizer(["Today is"], return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, return_dict_in_generate=True, output_scores=True)
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)
input_length = inputs.input_ids.shape[1]
generated_tokens = outputs.sequences[:, input_length:]
for tok, score in zip(generated_tokens[0], transition_scores[0]):
    # | token | token string | log probability | probability
    print(f"| {tok:5d} | {tokenizer.decode(tok):8s} | {score.numpy():.4f} | {np.exp(score.numpy()):.2%}")
# Expected output:
#| 262 | the | -1.4136 | 24.33%
#| 1110 | day | -2.6089 | 7.36%
#| 618 | when | -2.0096 | 13.40%
#| 356 | we | -1.8593 | 15.58%
#| 460 | can | -2.5083 | 8.14%
```
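To make the last two columns concrete: with `normalize_logits=True`, the scores are log probabilities, so `np.exp` turns them into probabilities. Here is a minimal sketch of that relationship using only the standard library. The logits, the toy 4-token vocabulary, and the `log_softmax` helper are all made up for illustration; this is not the library's implementation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of one list of logits."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]

# Hypothetical logits for 3 generation steps over a toy 4-token vocabulary.
step_logits = [
    [2.0, 1.0, 0.5, -1.0],
    [0.1, 0.2, 3.0, 0.0],
    [1.5, 1.5, 0.0, 0.0],
]
chosen_tokens = [0, 2, 1]  # hypothetical token id selected at each step

for step, (logits, token) in enumerate(zip(step_logits, chosen_tokens)):
    # Log probability of the selected token, then its plain probability.
    log_prob = log_softmax(logits)[token]
    print(f"step {step} | token {token} | {log_prob:.4f} | {math.exp(log_prob):.2%}")
```

The exponentiated values of each row sum to 1, which is what makes the last column readable as a probability.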

## Example 2 -- recompute the sequence scores from Beam Search

```
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import numpy as np
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token_id = tokenizer.eos_token_id
inputs = tokenizer(["Today is"], return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,
    num_beams=4,
    num_return_sequences=4,
    return_dict_in_generate=True,
    output_scores=True,
)
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, outputs.beam_indices, normalize_logits=False
)
# If you sum the generated tokens' scores and apply the length penalty, you'll get the sequence scores.
# Tip: set `normalize_logits=True` to recompute the scores from the normalized logits.
output_length = inputs.input_ids.shape[1] + np.sum(transition_scores.numpy() < 0, axis=1)
length_penalty = model.generation_config.length_penalty
reconstructed_scores = transition_scores.sum(axis=1) / (output_length**length_penalty)
print(np.allclose(outputs.sequences_scores, reconstructed_scores))
# Expected output:
#True
```
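The reconstruction boils down to simple arithmetic: sum the per-token scores, then divide by the length raised to `length_penalty`. Here is a minimal sketch in plain Python. `sequence_score` is a hypothetical helper, not a library function. The per-token values are reused from Example 1 purely as sample numbers, and the length of 7 is an assumption (prompt tokens plus generated tokens, mirroring how Example 2 computes `output_length`).

```python
def sequence_score(token_log_probs, length, length_penalty=1.0):
    """Sum of per-token log probabilities divided by length**length_penalty."""
    return sum(token_log_probs) / (length ** length_penalty)

# Sample per-token scores (the log probabilities printed in Example 1).
token_log_probs = [-1.4136, -2.6089, -2.0096, -1.8593, -2.5083]
length = 7  # hypothetical: 2 prompt tokens + 5 generated tokens

# length_penalty=0.0 leaves the raw sum untouched (length**0 == 1).
no_penalty = sequence_score(token_log_probs, length, length_penalty=0.0)
# length_penalty=1.0 divides the sum by the length, i.e. an average per counted token.
with_penalty = sequence_score(token_log_probs, length, length_penalty=1.0)
print(no_penalty, with_penalty)
```

Because the summed log probabilities are negative, a larger `length_penalty` pulls the score toward zero, which favors longer sequences.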

There is also an interactive demo here that uses this functionality to color-code generated text according to the token probabilities.
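As a rough idea of what probability-based color-coding can look like, here is a small standalone sketch that wraps each token in an HTML `<span>` whose background depends on its probability. The thresholds, colors, and the helpers `color_for_probability` and `render_tokens` are all assumptions made for illustration; this is not how the demo itself is implemented.

```python
import html

def color_for_probability(prob):
    """Map a probability to a background color (hypothetical thresholds)."""
    if prob >= 0.5:
        return "#b7e4c7"  # green: the model was fairly confident
    if prob >= 0.1:
        return "#ffe8a1"  # yellow: plausible, but far from certain
    return "#f5b7b1"      # red: low-probability token

def render_tokens(tokens_with_probs):
    """Build an HTML string with one colored <span> per (token, probability) pair."""
    spans = []
    for text, prob in tokens_with_probs:
        color = color_for_probability(prob)
        spans.append(
            f'<span style="background-color: {color}" title="{prob:.2%}">'
            f"{html.escape(text)}</span>"
        )
    return "".join(spans)

# Sample tokens and probabilities taken from the Example 1 output.
markup = render_tokens([(" the", 0.2433), (" day", 0.0736), (" when", 0.1340)])
print(markup)
```

The `(token, probability)` pairs can be built from `tokenizer.decode(tok)` and `np.exp(score)` inside the loop of Example 1.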

Let me know if you have comments, questions, and/or suggestions!

(P.S.: This new post is also meant as a replacement for an older one, which contains stale examples. Let's keep all further discussion around generated token probabilities here!)