I am looking for a feature similar to the num_return_sequences parameter of model.generate(), which controls how many generations are returned for each sample. It is especially useful with beam search, when analyzing how beam search affects the metrics.
Trainer.predict() does not seem to support this feature. Is there a workaround? I can use model.generate() directly, but the last time I tried it was very slow, because I had to write a for loop iterating over batches myself, whereas trainer.predict() handles the data loading and batching automatically.
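
For reference, here is a minimal sketch of the manual loop I mean, written so the batching and regrouping are independent of the model. The helper names (batched, generate_grouped, generate_fn) are my own and not part of transformers. It assumes that model.generate() with num_return_sequences=n returns batch_size * n sequences, with the n candidates for each input stored contiguously.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def generate_grouped(
    inputs: Sequence[str],
    generate_fn: Callable[[Sequence[str]], Sequence[str]],
    batch_size: int,
    num_return_sequences: int,
) -> List[List[str]]:
    """Run `generate_fn` batch by batch and regroup its flat output per sample.

    `generate_fn` (hypothetical) takes a batch of inputs and returns
    len(batch) * num_return_sequences outputs, candidates for the same
    input being adjacent.
    """
    grouped: List[List[str]] = []
    for batch in batched(inputs, batch_size):
        flat = list(generate_fn(batch))
        expected = len(batch) * num_return_sequences
        if len(flat) != expected:
            raise ValueError(f"expected {expected} outputs, got {len(flat)}")
        # Slice the flat list into one chunk of candidates per input sample.
        for i in range(len(batch)):
            start = i * num_return_sequences
            grouped.append(flat[start:start + num_return_sequences])
    return grouped


# Hypothetical wiring with transformers; `model` and `tokenizer` are assumed
# to be loaded already and to live on the same device:
#
# def generate_fn(texts):
#     enc = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
#     out = model.generate(**enc, num_beams=4, num_return_sequences=4)
#     return tokenizer.batch_decode(out, skip_special_tokens=True)
#
# candidates = generate_grouped(test_texts, generate_fn, batch_size=16, num_return_sequences=4)
```

With beam search, num_return_sequences cannot be larger than num_beams. This still leaves the batching to me rather than to the Trainer, which is the part I would like to avoid.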