Why does tokenizer.apply_chat_template() add multiple eos tokens?

These models were trained with the EOS token positioned this way to delimit the turns of a multi-turn conversation. Sticking to the prompt template used during instruction finetuning therefore keeps the model in the behaviour it was trained for.

The EOS token effectively signals to the model that whoever was speaking is now done.
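As an illustration, here is a minimal sketch of what a chat template does (the role markers and `</s>` token here are hypothetical and vary by model; the real `apply_chat_template()` renders the model's own Jinja template): each completed turn is closed with EOS, so a multi-turn history naturally contains several EOS tokens.

```python
EOS = "</s>"  # hypothetical EOS token; the actual token depends on the model

def apply_chat_template_sketch(messages, add_generation_prompt=True):
    """Sketch of how a chat template interleaves EOS tokens.
    Every finished message ends with EOS, so an n-turn history
    contains n EOS tokens."""
    out = []
    for m in messages:
        # close each completed turn with the EOS token
        out.append(f"<|{m['role']}|>\n{m['content']}{EOS}\n")
    if add_generation_prompt:
        # open the assistant turn with no EOS, so the model generates
        out.append("<|assistant|>\n")
    return "".join(out)

chat = [
    {"role": "user", "content": "Hi there"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "Tell me a joke."},
]
prompt = apply_chat_template_sketch(chat)
print(prompt)
print("EOS count:", prompt.count(EOS))  # one per completed turn -> 3
```

Note that the final assistant turn is left open without an EOS, which is exactly what invites the model to generate a reply rather than treat the conversation as finished.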

In some experiments I found that omitting the EOS token from my query caused the model to try to complete the query, whereas adding the EOS token caused it to reply to the query.

As with all things, experiment and see what happens.