Hi. I want to generate multiple sentences per input using Trainer.predict(), but cannot get it to work.
predict_results = trainer.predict(
    input_dataset["validation"],
    max_length=data_args.val_max_target_length,
    num_beams=generate_args.num_return_sequences * generate_args.beam_width,
    num_return_sequences=generate_args.num_return_sequences,
    num_beam_groups=generate_args.num_beam_groups,
    repetition_penalty=generate_args.repetition_penalty,
    diversity_penalty=generate_args.diversity_penalty,
    early_stopping=generate_args.early_stopping,
)
The above code produces only as many sequences as there are entries in the dataset, but I wanted multiple sequences per entry. Additionally, let's say I use a num_return_sequences of 4: the first 4 generations correspond to sentence 1, the next 4 to sentence 2, and so on, until there are len(dataset) sentences in total. How can I fix this issue? Is this a bug?
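
To make the layout I am after concrete, here is a minimal sketch in plain Python. It does not call transformers at all: group_generations and the sample strings are hypothetical names made up for illustration, and it only assumes that the flat output lists every candidate for input 1 first, then every candidate for input 2, and so on.

```python
from typing import List, Sequence


def group_generations(
    flat_outputs: Sequence[str], num_return_sequences: int
) -> List[List[str]]:
    """Split a flat list of generations into one group per input.

    Assumes the candidates for each input are stored contiguously,
    i.e. the first num_return_sequences items belong to input 0.
    """
    if num_return_sequences <= 0:
        raise ValueError("num_return_sequences must be positive")
    if len(flat_outputs) % num_return_sequences != 0:
        # A remainder means some input is missing candidates,
        # for example because the output was cut off.
        raise ValueError("output length is not a multiple of num_return_sequences")
    return [
        list(flat_outputs[start:start + num_return_sequences])
        for start in range(0, len(flat_outputs), num_return_sequences)
    ]


# Hypothetical flat output for 2 inputs with num_return_sequences=2.
flat = ["s1 candidate a", "s1 candidate b", "s2 candidate a", "s2 candidate b"]
grouped = group_generations(flat, 2)
```

With a complete output I would expect len(grouped) to equal len(dataset), with each inner list holding num_return_sequences candidates for one entry.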