Shape mismatch between `sequences` and `scores` in beam search generation

Hello,

I’m trying to use the scores returned by the generate() method. I want to take the returned sequences and scores, but the number of steps (sequence length) differs slightly between them.
The model I’m currently using to generate text is mT5, and these are the arguments passed to the generate method:

        max_length=None,
        min_length=None,
        do_sample=False,
        early_stopping=True,
        num_beams=3,
        temperature=1.0,
        top_k=None,
        top_p=None,
        length_penalty=1.0,  # > 1.0 longer sequences, < 1.0 shorter sequences
        num_return_sequences=1,
        max_time=None,  # in seconds
        num_beam_groups=1,
        output_scores=True,
        return_dict_in_generate=True,

Let’s say I have text1; the returned shapes are:

  • sequences shape: [1, 69]
  • scores shape (after applying torch.stack): [3, 68, vocab_size]

For another text2:

  • sequences shape: [1, 72]
  • scores shape (after applying torch.stack): [3, 72, vocab_size]

I know it’s only a one-step difference, but it can break the alignment between tokens and their scores.
Is there something I can do to fix this? Which tokens should I ignore to align them correctly?
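For the text1 case, my understanding is that the one-step offset comes from the decoder start token: for encoder-decoder models like mT5, generate() prepends decoder_start_token_id to sequences, and no score is produced for that token, so sequences has one more position than scores has steps. A minimal shape-only sketch with dummy tensors (no model involved; num_beams, steps, and vocab_size are made-up numbers mirroring the text1 shapes):

```python
import torch

num_beams, steps, vocab_size = 3, 68, 10

# scores: one [num_beams, vocab_size] tensor per decoding step;
# stacking along dim=1 gives [num_beams, steps, vocab_size]
scores = torch.stack(
    [torch.randn(num_beams, vocab_size) for _ in range(steps)], dim=1
)

# sequences include the decoder start token, which has no score entry
sequences = torch.randint(0, vocab_size, (1, steps + 1))

# Drop the start token so position i of `generated` pairs with score step i
generated = sequences[:, 1:]
assert generated.shape[1] == scores.shape[1]
```

If that is indeed the cause, ignoring the first token of sequences should restore the alignment.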

Regards

I noticed that the problem comes from the num_return_sequences argument: it was lower than the number of beams (num_beams). If they are the same, the returned shapes match. I don’t know why this happens; when num_return_sequences=1, why doesn’t it return the scores of the first (best) beam’s sequence?
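To illustrate my guess with hypothetical numbers (not from a real run): scores seems to get one entry per decoding step, and with early_stopping the loop keeps going until every beam has finished, while num_return_sequences=1 returns only the best beam’s sequence, which may have stopped earlier. The finish steps below are made up to reproduce the text2 shapes:

```python
# Hypothetical step at which each of the 3 beams emits EOS
finish_step = {"beam0": 71, "beam1": 72, "beam2": 72}

# One scores entry is recorded per decoding step, until the last beam stops
num_score_steps = max(finish_step.values())

# With num_return_sequences=1 only the best beam (say beam0) is returned;
# its sequence is the decoder start token plus its own generated tokens
best_sequence_len = 1 + finish_step["beam0"]

# sequences can then be [1, 72] while scores also has 72 steps
print(num_score_steps, best_sequence_len)  # 72 72
```

That would explain why setting num_return_sequences=num_beams makes the shapes consistent again: the returned sequences are then padded out to the longest beam, which ran for all num_score_steps steps.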