TFBertModel for classification task with no CLS token

BaroneRampante · March 11, 2023, 6:21pm

Hello, I’m reading a paper where BERT (TFBertModel) and RoBERTa (TFRobertaModel) are used to solve a text classification task.

Preamble
Going through the implementation, I noticed that each text sample is tokenized without special tokens: tokenizer.encode(sentence, add_special_tokens=False).

Later on, the tokenizer outputs are passed to the respective models and the pooled output is retrieved as follows:

embedding_BERT = encoder_BERT(
    input_ids_BERT,
    token_type_ids=token_type_ids_BERT,
    attention_mask=attention_mask_BERT
)['pooler_output']

Questions

1. The authors claim to be using the [CLS] tokens produced by both models. However, how can this be the case if the tokenizers encoded the text samples without the special tokens?
2. If add_special_tokens is False, does the first token of each text sample still encode knowledge about the whole sequence, as is usually the case with [CLS]?
3. The authors actually use the pooled output, which is produced by BertPooler. Can its output still be considered the [CLS] token representation?
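
To make the last question concrete: as far as I understand, BertPooler takes the final hidden state at position 0 and passes it through a dense layer with a tanh activation. Below is a minimal NumPy sketch of that step, not the library code itself. The function name bert_pooler, the random weights, and the tensor sizes are my own assumptions for illustration.

```python
import numpy as np

def bert_pooler(hidden_states, weight, bias):
    """Sketch of the pooling step: dense + tanh on the position-0 hidden state.

    hidden_states: (batch, seq_len, hidden) final-layer outputs
    weight:        (hidden, hidden) dense kernel (random here, an assumption)
    bias:          (hidden,) dense bias (random here, an assumption)
    """
    first_token = hidden_states[:, 0, :]         # whatever token sits at position 0
    return np.tanh(first_token @ weight + bias)  # (batch, hidden)

# Hypothetical sizes and random values, for illustration only.
rng = np.random.default_rng(0)
batch, seq_len, hidden = 2, 5, 8
hidden_states = rng.normal(size=(batch, seq_len, hidden))
weight = rng.normal(size=(hidden, hidden))
bias = rng.normal(size=hidden)

pooled = bert_pooler(hidden_states, weight, bias)

# Perturbing the final hidden states at positions 1..n does not change the
# pooled vector: the pooler only reads position 0.
perturbed = hidden_states.copy()
perturbed[:, 1:, :] += 1.0
pooled_perturbed = bert_pooler(perturbed, weight, bias)
```

If this sketch is right, pooler_output is always a transformed position-0 vector. With add_special_tokens=False, position 0 holds the first word piece of the sentence rather than [CLS]. Inside the real model that position still attends to the rest of the sequence, so the sketch only shows where the pooler reads from, not how much sequence-level information ends up there.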

References
paper: https://ceur-ws.org/Vol-3202/politices-paper1.pdf
code: PoliticES2022/PoliticES.ipynb at main · ssantamaria94/PoliticES2022 · GitHub
