The num_return_sequences parameter in model.generate does not return unique outputs

If I have a BART model, for example, and I run this:

bart_output = BART.generate(input, temperature=1.0, num_return_sequences=3, num_beams=5, do_sample=True)

I find that all 3 outputs are almost always exactly the same. However, if I call the function 3 separate times, like this:

[BART.generate(input, temperature=1.0, num_return_sequences=1, num_beams=5, do_sample=True) for _ in range(3)]

I will often get unique outputs.
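In case it helps anyone reproduce this, here is a minimal sketch of one way to check how many of the returned sequences are actually distinct. It assumes the generated ids have already been decoded into plain strings (for example with the tokenizer's `batch_decode` and `skip_special_tokens=True`). The helper name `count_unique` and the sample strings are made up for illustration and are not real model output.

```python
from collections import Counter


def count_unique(texts):
    """Return (number of distinct strings, list of (string, count) pairs).

    `texts` is assumed to be a list of already-decoded output strings.
    Leading and trailing whitespace is ignored when comparing.
    """
    counts = Counter(t.strip() for t in texts)
    return len(counts), counts.most_common()


# Hypothetical decoded outputs, only to show how the helper is used.
single_call = ["a summary", "a summary", "a summary "]
repeated_calls = ["a summary", "another summary", "a third summary"]

n_single, _ = count_unique(single_call)
n_repeated, _ = count_unique(repeated_calls)
print(n_single, n_repeated)
```

Comparing decoded strings rather than raw id tensors avoids false mismatches from padding when the outputs come from separate calls with different lengths.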

Can anyone explain what’s going on here? This hacky solution is very inefficient!