Generating multiple sequences with `Trainer.predict()`

AfonsoSousa · February 17, 2023, 2:15pm

Hi. I want to generate multiple sequences per input using `Trainer.predict()`, but I can't get it to work.

predict_results = trainer.predict(
    input_dataset["validation"],
    max_length=data_args.val_max_target_length,
    num_beams=generate_args.num_return_sequences * generate_args.beam_width,
    num_return_sequences=generate_args.num_return_sequences,
    num_beam_groups=generate_args.num_beam_groups,
    repetition_penalty=generate_args.repetition_penalty,
    diversity_penalty=generate_args.diversity_penalty,
    early_stopping=generate_args.early_stopping,
)

The code above returns only as many sequences as there are entries in the dataset, but I want several per entry. Additionally, let's say I set `num_return_sequences` to 4: the first 4 generations correspond to sentence 1, the next 4 to sentence 2, and so on, until `len(dataset)` sequences have been returned. How can I fix this? Is this a bug?
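For reference, here is a minimal sketch of the grouping I would expect. It is not a fix for `Trainer.predict()` itself. It assumes the generations are produced outside the trainer (for example by calling `model.generate()` batch by batch and decoding the result) and that they arrive as one flat list ordered input by input, which matches the ordering described above. The helper name `group_generations` and the sample strings are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_generations(flat_outputs: Sequence[T], num_return_sequences: int) -> List[List[T]]:
    """Split a flat list of generations into one sub-list per input example.

    Assumption: outputs are ordered input by input, i.e. the first
    `num_return_sequences` items belong to input 0, the next ones to
    input 1, and so on.
    """
    if num_return_sequences < 1:
        raise ValueError("num_return_sequences must be >= 1")
    if len(flat_outputs) % num_return_sequences != 0:
        raise ValueError(
            "number of outputs is not a multiple of num_return_sequences; "
            "the output may have been truncated"
        )
    return [
        list(flat_outputs[i : i + num_return_sequences])
        for i in range(0, len(flat_outputs), num_return_sequences)
    ]


# Hypothetical decoded outputs for 2 inputs with num_return_sequences=2.
decoded = ["s1 cand a", "s1 cand b", "s2 cand a", "s2 cand b"]
grouped = group_generations(decoded, num_return_sequences=2)
# grouped[0] holds the candidates for input 0, grouped[1] those for input 1.
```

With real data, `decoded` would come from something like `tokenizer.batch_decode(model.generate(**batch, num_return_sequences=k), skip_special_tokens=True)`, so the full set of `len(dataset) * k` sequences is kept rather than cut down to `len(dataset)`. The length check in the helper would also flag a truncated output.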
