Shape mismatch between `sequences` and `scores` in beam search generation

Hello,

I’m trying to use the scores returned by the generate() method. I want to take the returned sequences and scores, but the number of steps (sequence length) differs slightly between them.
The model I’m currently using to generate text is mT5, and these are the arguments passed to the generate method:

        max_length=None,
        min_length=None,
        do_sample=False,
        early_stopping=True,
        num_beams=3,
        temperature=1.0,
        top_k=None,
        top_p=None,
        length_penalty=1.0,  # > 1.0 longer sequences, < 1.0 shorter sequences
        num_return_sequences=1,
        max_time=None,  # in seconds
        num_beam_groups=1,
        output_scores=True,
        return_dict_in_generate=True,

Let’s say I have text1; the returned shapes are:

  • sequences shape: [1, 69]
  • scores shape (after applying torch.stack): [3, 68, vocab_size]

For another text2:

  • sequences shape: [1, 72]
  • scores shape (after applying torch.stack): [3, 72, vocab_size]

I know it’s only a one-step difference, but it can break the alignment between tokens and their scores.
Is there something I can do to fix this? Which tokens should I ignore to align them correctly?
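For the text1 case, my understanding is that the one-step offset comes from the decoder start token: for encoder-decoder models like mT5, generate() prepends decoder_start_token_id to sequences, and no score is produced for that token, so sequences has one more position than scores has steps. A minimal shape-only sketch with dummy tensors (no model involved; num_beams, steps, and vocab_size are made-up numbers mirroring the text1 shapes):

```python
import torch

num_beams, steps, vocab_size = 3, 68, 10

# scores: one [num_beams, vocab_size] tensor per decoding step;
# stacking along dim=1 gives [num_beams, steps, vocab_size]
scores = torch.stack(
    [torch.randn(num_beams, vocab_size) for _ in range(steps)], dim=1
)

# sequences include the decoder start token, which has no score entry
sequences = torch.randint(0, vocab_size, (1, steps + 1))

# Drop the start token so position i of `generated` pairs with score step i
generated = sequences[:, 1:]
assert generated.shape[1] == scores.shape[1]
```

If that is indeed the cause, ignoring the first token of sequences should restore the alignment.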

Regards

I noticed that the problem comes from the num_return_sequences argument: it was lower than the number of beams (num_beams). If they are the same, the returned shapes match. I don’t know why this happens; when num_return_sequences=1, why doesn’t it return the scores of the first (best) beam’s sequence?
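To illustrate my guess with hypothetical numbers (not from a real run): scores seems to get one entry per decoding step, and with early_stopping the loop keeps going until every beam has finished, while num_return_sequences=1 returns only the best beam’s sequence, which may have stopped earlier. The finish steps below are made up to reproduce the text2 shapes:

```python
# Hypothetical step at which each of the 3 beams emits EOS
finish_step = {"beam0": 71, "beam1": 72, "beam2": 72}

# One scores entry is recorded per decoding step, until the last beam stops
num_score_steps = max(finish_step.values())

# With num_return_sequences=1 only the best beam (say beam0) is returned;
# its sequence is the decoder start token plus its own generated tokens
best_sequence_len = 1 + finish_step["beam0"]

# sequences can then be [1, 72] while scores also has 72 steps
print(num_score_steps, best_sequence_len)  # 72 72
```

That would explain why setting num_return_sequences=num_beams makes the shapes consistent again: the returned sequences are then padded out to the longest beam, which ran for all num_score_steps steps.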