Llama2 pad token for batched inference

I find the following works very well:

```python
tokenizer.pad_token = "[PAD]"
tokenizer.padding_side = "left"
```

I used to use what you had, but I found that batched inference with that setup gives different results than sequential inference, which is not supposed to happen.
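
For context, here is a minimal sketch of how this fits into a batched generation call. The checkpoint name (`meta-llama/Llama-2-7b-hf`), the prompts, and the generation parameters are my own assumptions for illustration; adjust them to your setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; use whichever Llama 2 variant you have

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = "[PAD]"    # Llama 2 ships without a pad token, so set one explicitly
tokenizer.padding_side = "left"  # left-pad so every prompt ends right where generation starts

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

prompts = ["Tell me a joke.", "Explain beam search in one sentence."]
batch = tokenizer(prompts, return_tensors="pt", padding=True)  # pads shorter prompts on the left

with torch.no_grad():
    out = model.generate(
        **batch,  # input_ids plus the attention_mask that masks out the padding
        max_new_tokens=50,
        pad_token_id=tokenizer.pad_token_id,
    )

print(tokenizer.batch_decode(out, skip_special_tokens=True))
```

The reason left padding matters here: with right padding, the pad tokens end up between the prompt and the newly generated tokens, so batched outputs can diverge from sequential ones. Left padding keeps each prompt flush against the generation boundary.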
