Unique text generation per-GPU


I had a quick question about properly generating text on multiple GPUs.
Right now I am generating text by splitting GPT-J-6B across a few nodes.

Usually I set a fixed seed for my jobs. When I do that, the generated text from each job is identical, so it only makes sense to save one of them. I believe that is what the last step of this DeepSpeed example does: deepspeed for GPT-Neo-2.7B

But when I don’t set an explicit seed, each job gets a different RNG state. As a result, I get as many unique text outputs as there are GPUs.

Is leaving the shared seed unset a valid way to “cheat” and increase text generation throughput? Or should a seed be set for another reason (e.g. something in synced_gpus=True or DeepSpeed needs it)?
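
To make the question concrete, here is a minimal sketch of a middle ground between "one shared seed" and "no seed at all": derive a distinct but reproducible seed per process from a shared base seed. This is not taken from the DeepSpeed example. The helper names `per_rank_seed` and `seed_for_this_process` are hypothetical, the base seed 1234 is arbitrary, and the sketch assumes the launcher exports a `RANK` environment variable for each process.

```python
import os
import random


def per_rank_seed(base_seed: int, rank: int) -> int:
    """Hypothetical helper: offset a shared base seed by the process rank."""
    return base_seed + rank


def seed_for_this_process(base_seed: int = 1234) -> int:
    # Assumption: the launcher sets RANK per process; fall back to 0 if it is absent.
    rank = int(os.environ.get("RANK", "0"))
    seed = per_rank_seed(base_seed, rank)
    # Seed Python's RNG here. With transformers installed, one would likely
    # also call transformers.set_seed(seed) so torch and numpy are seeded too.
    random.seed(seed)
    return seed


if __name__ == "__main__":
    print("seed for this process:", seed_for_this_process())
```

This would only seem to make sense if every process holds its own full copy of the model. If the processes jointly hold shards of a single model, they presumably have to agree on each sampled token, in which case differing seeds could be a problem and not a shortcut.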

Apologies if this has been asked before; I could not find it via keyword search.

Thank you!