A forward pass through the model should be deterministic, as far as I understand: the same input sequence gives the same logits, which are turned into next-token prediction probabilities. The randomness I'm aware of in most LLMs comes from how you decide to pick the next token (e.g., greedy decoding, top-k sampling, beam search), not from the forward pass through the network.
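A minimal sketch of the distinction, using toy hand-written logits in place of a real forward pass (the array values here are made up for illustration):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

# Stand-in for the logits a forward pass would produce for one fixed input.
# With the same weights and the same input, these come out identical on
# every call -- the forward pass itself is deterministic.
logits = np.array([2.0, 1.0, 0.5, -1.0])
probs = softmax(logits)

# Greedy decoding: always take the highest-probability token.
# Deterministic -- repeated calls give the same token id.
greedy_token = int(np.argmax(probs))

# Sampling: draw a token from the distribution. This is where the
# randomness enters; different runs can pick different tokens unless
# you seed the RNG.
rng = np.random.default_rng()
sampled_token = int(rng.choice(len(probs), p=probs))
```

So the logits (and `probs`) are fixed by the input; only the sampling step at the end is stochastic, and seeding the RNG (or using greedy decoding) removes even that.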