When training Llama for sequence classification, should the final token be an EOS?

The doc string says:

LlamaForSequenceClassification uses the last token in order to do the classification, as other causal models (e.g. GPT-2) do.

Should this “last token” be an EOS, or simply the final token of the input with no EOS appended? My interpretation is that it should not be an EOS: if an EOS were required, the docs would probably say so explicitly. Besides, many people use the EOS token as the pad token, and since the model picks the last non-padding token for classification, an appended EOS would just be treated as padding and skipped, so it would be identical to not appending one at all.
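To make that second point concrete, here is a rough sketch of how I understand the last-token position gets picked when a pad token is configured (this is my reading of the behaviour, not the actual library source, and the token IDs are made up for illustration):

```python
import torch

# Suppose eos_token_id == pad_token_id == 2, as many people configure it.
pad_token_id = 2
input_ids = torch.tensor([
    [5, 6, 7, 2, 2],  # real tokens, then an appended EOS, then padding
    [5, 6, 7, 8, 9],  # no padding at all
])

# First occurrence of the pad token, minus one -> index of the last non-pad token.
# If a row contains no pad token, argmax returns 0, so -1 wraps around to the
# final position, which is what we want for the fully un-padded row.
sequence_lengths = torch.eq(input_ids, pad_token_id).int().argmax(-1) - 1
print(sequence_lengths)  # tensor([ 2, -1]) -> the positions of tokens 7 and 9

# The classification logits are then gathered at those positions, roughly:
# pooled = logits[torch.arange(input_ids.shape[0]), sequence_lengths]
```

So with EOS doubling as the pad token, the appended EOS in the first row is never the token that gets classified.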

However, I’m not certain so I’d appreciate it if anyone knew!

@ArthurZ would you happen to know the answer to this? Thanks!

This is a standard classification head; Llama was not pretrained with LlamaForSequenceClassification, so you can use whichever convention you want, no?
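For anyone finding this thread later, a minimal setup along those lines might look like the following (the checkpoint name is just a placeholder, and the classification head is freshly initialized, so it needs fine-tuning before the logits mean anything):

```python
from transformers import AutoTokenizer, LlamaForSequenceClassification

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LlamaForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Llama has no pad token by default; reusing EOS as the pad token is a common choice.
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer(["great movie", "terrible movie, avoid"], padding=True, return_tensors="pt")
logits = model(**inputs).logits  # shape: (2, num_labels)
```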