The docstring says:
LlamaForSequenceClassification uses the last token in order to do the classification, as other causal models (e.g. GPT-2) do.
Should this “last token” be an EOS token, or simply the final token of the input without any EOS? My interpretation is that it is not an EOS, because otherwise the docstring would probably say so explicitly. Moreover, many people use the EOS token as the pad token, and in that case a trailing EOS would be indistinguishable from padding, so the result would be the same as not using an EOS as the “last token” for sequence classification.
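To make the point concrete, here is a minimal sketch (my own illustration, not the actual library code) of how the last-token index could be located when padding is present: walk back from the end and take the last position whose id is not the pad id. Under this logic, if `eos_token_id` and `pad_token_id` are the same, a trailing EOS gets skipped exactly like padding.

```python
def last_token_index(input_ids, pad_token_id):
    """Return the index of the last non-pad token in a single sequence.

    Hypothetical helper for illustration; scans backwards so that any
    trailing pad tokens (or an EOS that shares the pad id) are skipped.
    """
    for i in range(len(input_ids) - 1, -1, -1):
        if input_ids[i] != pad_token_id:
            return i
    return 0  # degenerate all-pad sequence

# Suppose id 2 is both EOS and pad (a common setup):
ids = [10, 11, 12, 2, 2, 2]
print(last_token_index(ids, pad_token_id=2))  # -> 2, i.e. the token `12`
```

So with a shared EOS/pad id, the classification head would end up pooling the final *content* token either way, which is why I lean toward the “no EOS” reading.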
However, I’m not certain, so I’d appreciate it if anyone knows!