Hey everyone
We have just merged a PR that exposes a new function related to .generate(), compute_transition_scores. With this function, you can quickly solve any problem that requires the probabilities of generated tokens, for any generation strategy. It is also nicely documented – see here.
How can this function help you? Let me give you two simple examples!
Example 1 -- print the probabilities for the output generated by Greedy Search
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import numpy as np
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token_id = tokenizer.eos_token_id
inputs = tokenizer(["Today is"], return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, return_dict_in_generate=True, output_scores=True)
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)
input_length = inputs.input_ids.shape[1]
generated_tokens = outputs.sequences[:, input_length:]
for tok, score in zip(generated_tokens[0], transition_scores[0]):
    # | token | token string | log probability | probability
    print(f"| {tok:5d} | {tokenizer.decode(tok):8s} | {score.numpy():.4f} | {np.exp(score.numpy()):.2%}")
# Expected output:
#| 262 | the | -1.4136 | 24.33%
#| 1110 | day | -2.6089 | 7.36%
#| 618 | when | -2.0096 | 13.40%
#| 356 | we | -1.8593 | 15.58%
#| 460 | can | -2.5083 | 8.14%
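Under the hood, with normalize_logits=True the function essentially applies a log-softmax over the vocabulary at each step and gathers the entry of the chosen token. Here is a minimal numpy sketch of that gather, using hypothetical logits over a tiny 4-token vocabulary rather than real model output:

```python
import numpy as np

# Hypothetical logits for 3 generation steps over a 4-token vocabulary
scores = np.array([
    [2.0, 1.0, 0.5, 0.1],
    [0.3, 3.0, 0.2, 0.1],
    [1.0, 0.5, 2.5, 0.0],
])
chosen = np.array([0, 1, 2])  # token picked at each step (here, the argmax)

# log-softmax over the vocabulary dimension
log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))

# gather the log-probability of each chosen token -- one score per step
transition_scores = log_probs[np.arange(len(chosen)), chosen]
print(transition_scores)  # three negative numbers, one per generated token
```

Each gathered score is a log-probability, which is why exponentiating it (as in the print loop above) yields a percentage.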
Example 2 -- recompute the sequence scores from Beam Search
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import numpy as np
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token_id = tokenizer.eos_token_id
inputs = tokenizer(["Today is"], return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,
    num_beams=4,
    num_return_sequences=4,
    return_dict_in_generate=True,
    output_scores=True,
)
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, outputs.beam_indices, normalize_logits=False
)
# If you sum the generated tokens' scores and apply the length penalty, you'll get the sequence scores.
# Tip: set `normalize_logits=True` to recompute the scores from the normalized logits.
output_length = np.sum(transition_scores.numpy() < 0, axis=1)
length_penalty = model.generation_config.length_penalty
reconstructed_scores = transition_scores.sum(axis=1) / (output_length**length_penalty)
print(np.allclose(outputs.sequences_scores, reconstructed_scores))
# Expected output:
#True
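To make the reconstruction arithmetic concrete, here is a toy numpy check with hypothetical transition scores (made-up numbers, not model output). The second row is padded to show why counting the strictly-negative entries recovers each sequence's length:

```python
import numpy as np

# Hypothetical per-token transition scores (log-probabilities, hence negative)
transition_scores = np.array([
    [-1.4, -2.6, -2.0, -1.9, -2.5],
    [-1.1, -0.9, -3.0,  0.0,  0.0],  # shorter sequence, padded with 0.0
])

# Padding positions carry a score of exactly 0, so counting the
# strictly-negative entries gives each sequence's true generated length
output_length = np.sum(transition_scores < 0, axis=1)  # [5, 3]

length_penalty = 1.0  # the default in the generation config
reconstructed = transition_scores.sum(axis=1) / (output_length**length_penalty)
print(reconstructed)  # [-10.4 / 5, -5.0 / 3]
```

With length_penalty=1.0 this is simply the mean per-token log-probability; other values of the penalty reward or discourage longer sequences.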
There is also an interactive demo here that uses this functionality to color-code generated text according to the token probabilities.
Let me know if you have comments, questions, and/or suggestions!
(P.S.: This new post is also meant as a replacement for an older one, which contains stale examples. Let’s keep all further discussion around generated token probabilities here!)