I have found a post that could explain this: Possible Bug with KV Caching in Llama (original) model · Issue #25420 · huggingface/transformers · GitHub
In short: using the KV cache can change the logits slightly, especially when the model is loaded in 16-bit precision.
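The root cause is that the cached and uncached paths run the same math in different shapes and orders, and half-precision floating-point addition is not associative, so the rounding comes out differently. This is a minimal NumPy sketch of that numerical effect only (not the actual transformers code path): summing the same fp16 values sequentially versus in chunks can give results that do not match bit-for-bit.

```python
import numpy as np

# fp16 addition is not associative: reordering the same additions can
# round differently. The KV cache changes the shapes/order of the
# matmuls inside the model, which is why cached and uncached runs can
# produce slightly different 16-bit logits.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024).astype(np.float16)

# "Uncached" style: one long sequential accumulation.
s_sequential = np.float16(0.0)
for v in x:
    s_sequential = np.float16(s_sequential + v)

# "Cached" style: accumulate partial sums per chunk, then combine them.
s_chunked = np.float16(0.0)
for chunk in x.reshape(8, 128):
    partial = np.float16(0.0)
    for v in chunk:
        partial = np.float16(partial + v)
    s_chunked = np.float16(s_chunked + partial)

# Both are valid fp16 results for the same sum, but they need not match
# bit-for-bit; in fp32/fp64 the discrepancy would be far smaller.
print(s_sequential, s_chunked)
```

The same idea scales up to attention: with the cache, each new token's scores are computed against stored keys one step at a time, so the fp16 rounding path differs from a full-sequence forward pass even though the math is equivalent in exact arithmetic.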