Resolved [here](https://github.com/huggingface/transformers/issues/25507) ("Is non-determinism in outputs generated by LlamaForCausalLM the expected behavior?" · Issue #25507 · huggingface/transformers).
TL;DR: pass `do_sample=False` to `model.generate`, i.e. change
`model.generate(inputs.input_ids, max_length=30)`
to
`model.generate(inputs.input_ids, max_length=30, do_sample=False)`.
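
A minimal sketch of the fix, assuming the standard `transformers` API; the model ID below is a placeholder, any causal LM checkpoint works the same way:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute whatever model you are loading.
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")

# do_sample=False switches generation to greedy decoding, so repeated
# calls with the same inputs produce the same output tokens.
outputs = model.generate(inputs.input_ids, max_length=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that some checkpoints ship a `generation_config` that enables sampling by default, which is why outputs can vary run to run until you set `do_sample=False` explicitly (or seed the RNG if you actually want sampling).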