TFBertModel for classification task with no CLS token

Hello, I’m reading a paper in which BERT (TFBertModel) and RoBERTa (TFRobertaModel) are used to solve a text classification task.

Preamble
Going through the implementation, I noticed that each text sample is tokenized without special tokens: tokenizer.encode(sentence, add_special_tokens=False).
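
For context, here is a minimal sketch of what that flag changes (the checkpoint name and the sentence are placeholders of mine, not taken from the notebook):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint
sentence = "A short example sentence."

# Without special tokens, no [CLS]/[SEP] ids are added
ids_plain = tokenizer.encode(sentence, add_special_tokens=False)
# With special tokens, [CLS] is prepended and [SEP] is appended
ids_special = tokenizer.encode(sentence, add_special_tokens=True)

print(tokenizer.convert_ids_to_tokens(ids_plain))    # ['a', 'short', 'example', 'sentence', '.']
print(tokenizer.convert_ids_to_tokens(ids_special))  # ['[CLS]', 'a', 'short', 'example', 'sentence', '.', '[SEP]']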

Later on, the outputs of the tokenizers are passed to the respective models and the pooled output is retrieved, as follows:

embedding_BERT = encoder_BERT(
    input_ids_BERT,
    token_type_ids=token_type_ids_BERT,
    attention_mask=attention_mask_BERT
)['pooler_output']
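
To make the question easier to discuss, here is a rough, self-contained version of that setup as I understand it (the checkpoint and the sample text are my own placeholders, not the authors'):

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")    # placeholder checkpoint
encoder_BERT = TFBertModel.from_pretrained("bert-base-uncased")

enc = tokenizer(
    "some sample text",
    add_special_tokens=False,   # as in the notebook: no [CLS]/[SEP]
    return_tensors="tf",
)

outputs = encoder_BERT(
    enc["input_ids"],
    token_type_ids=enc["token_type_ids"],
    attention_mask=enc["attention_mask"],
)

# The pooled output has shape (batch, hidden_size); it is derived from the
# hidden state at the first position of the sequence, which here is an
# ordinary word piece rather than [CLS].
embedding_BERT = outputs["pooler_output"]
print(embedding_BERT.shape)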

Questions

  1. The authors claim to be using the [CLS] tokens produced by both models. How can this be the case if the tokenizers encoded the text samples without the special tokens?
  2. If add_special_tokens is False, does the first token of each text sample still encode knowledge about the whole sequence, as is usually the case with [CLS]?
  3. The authors actually use the pooled output, which is produced by BertPooler. Can its output still be considered the [CLS] token's representation? (A rough check is sketched after this list.)
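
Regarding question 3: my understanding from the Transformers source is that the pooler takes the hidden state at the first position and passes it through a dense layer with a tanh activation, so pooler_output is derived from whatever token happens to sit first. A rough check (the bert.pooler attribute path is my assumption about the current TFBertModel layout and may differ across library versions):

import tensorflow as tf
from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")    # placeholder checkpoint
model = TFBertModel.from_pretrained("bert-base-uncased")

enc = tokenizer("another sample", add_special_tokens=False, return_tensors="tf")
out = model(**enc)

# The pooler operates on last_hidden_state[:, 0] (dense layer + tanh), so
# pooler_output is a transformed copy of whatever token happens to sit first.
manual = model.bert.pooler(out.last_hidden_state)
print(float(tf.reduce_max(tf.abs(manual - out.pooler_output))))  # expected ~0.0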

References
paper: https://ceur-ws.org/Vol-3202/politices-paper1.pdf
code: PoliticES.ipynb (branch main) in the ssantamaria94/PoliticES2022 repository on GitHub