Hey @PastelBelem8, have you figured out how to get the log probabilities of any given sequence yet?

I’m trying to implement the same thing with a T5 model. My approach was to use `.compute_transition_scores()` and reconstruct the sequence score from the per-token scores.

As described in the documentation, `.compute_transition_scores()` takes `sequences` and `scores` and returns the transition score for each token id in the sequences provided. The snippet below takes the output of `model.generate()` and gets the transition scores for the generated example. However, instead of `outputs.sequences`, we could pass the token ids of the wanted sequence to `.compute_transition_scores()`.
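For context, reconstructing the sequence score from the per-token transition scores amounts to summing the log-probabilities (when the scores are normalized). A minimal stdlib sketch with made-up numbers, just to show the arithmetic:

```
import math

def sequence_score(transition_scores):
    # Transition scores are per-token log-probabilities, so the
    # log-probability of the whole sequence is their sum.
    return sum(transition_scores)

# Hypothetical per-token log-probabilities for a 3-token sequence
token_log_probs = [-0.5, -1.2, -0.3]
total = sequence_score(token_log_probs)  # -2.0
probability = math.exp(total)            # probability of the full sequence
```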

```
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/unifiedqa-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/unifiedqa-t5-small")

inputs = tokenizer(["question: <q> context: <c>"], return_tensors="pt")
outputs = model.generate(inputs.input_ids, return_dict_in_generate=True, output_scores=True)

# instead of this
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=False
)

# do this
wanted_seq = tokenizer(["wanted sequence"], return_tensors="pt").input_ids
# prepend the decoder start token id (0 for T5)
wanted_seq = torch.cat([torch.tensor([[0]]), wanted_seq], dim=1)
transition_scores = model.compute_transition_scores(
    wanted_seq, outputs.scores, normalize_logits=False
)
```
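Conceptually, what `.compute_transition_scores()` does at each step is look up the wanted token's score in that step's logits (log-softmaxed when `normalize_logits=True`). Here is a rough stdlib illustration of that lookup, not the library's actual implementation:

```
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a plain list of logits
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def transition_scores(token_ids, step_logits):
    # token_ids: wanted sequence, excluding the decoder start token
    # step_logits: one logits list per generation step (like outputs.scores)
    if len(token_ids) > len(step_logits):
        raise ValueError("wanted sequence is longer than the available scores")
    return [log_softmax(logits)[tok] for tok, logits in zip(token_ids, step_logits)]
```

This also makes the length problem below visible: when the wanted sequence is longer than the number of generation steps, there are simply no logits left to score the extra tokens.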

However, I’ve realized that when the length of `wanted_seq` is greater than the length of the sequence produced by `.generate()`, `outputs.scores` does not cover the full length of the wanted sequence's input ids. So I’ve been trying to modify the behavior of `.generate()` to get scores for the full length by passing `stopping_criteria`, but I still haven’t figured out how to do so.

What I’d like is to get scores for the full length I specify (up to 32 in the example below, without early stopping). Can @patrickvonplaten help me with this? I’ve tried the following, but it throws an error.

```
import torch
from transformers.generation.stopping_criteria import StoppingCriteriaList, MaxLengthCriteria

stopping_criteria = StoppingCriteriaList([MaxLengthCriteria(max_length=32)])
outputs = model.generate(
    inputs["input_ids"].cuda() if torch.cuda.is_available() else inputs["input_ids"],
    num_beams=args.num_beams,  # from my argparse config
    max_length=32,
    stopping_criteria=stopping_criteria,
    early_stopping=False,
    output_scores=True,
    return_dict_in_generate=True,
)
```