RobertaTokenizer: How to enable masking of custom special tokens

Hi!

I am trying to include some of my vocabulary as special tokens in RobertaTokenizer, but have noticed they are not flagged properly in the special tokens mask used for the MLM objective:

tokenizer = RobertaTokenizer.from_pretrained(args.tokenizer_path, additional_special_tokens=["[SPECIAL_TOK]"])

tokenizer.all_special_ids → [0, 2, 3, 2, 1, 0, 4, 32000]

t("A test [SPECIAL_TOK] now", return_special_tokens_mask=True)

→ {'input_ids': [0, 107, 320, 32000, 37, 2], 'special_tokens_mask': [1, 0, 0, 0, 0, 1], 'attention_mask': [1, 1, 1, 1, 1, 1]}

I expect 'special_tokens_mask' to be [1, 0, 0, 1, 0, 1]. Do I just need to override the data collator I use with RobertaForMaskedLM so that my custom special tokens are excluded from masking, or why is this happening? For context, I trained a custom BPE with this module:

from tokenizers.implementations import ByteLevelBPETokenizer

And I set special_tokens in there so that these tokens stay atomic (never split by the BPE). I also do not want them to be masked/predicted when training my LM.
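The training call was roughly like this (reconstructing from memory; the file path, vocab size, and min_frequency below are placeholders, not my actual values):

from tokenizers.implementations import ByteLevelBPETokenizer

bpe = ByteLevelBPETokenizer()
bpe.train(
    files=["corpus.txt"],   # placeholder path
    vocab_size=32000,       # placeholder size
    min_frequency=2,        # placeholder
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>", "[SPECIAL_TOK]"],
)
bpe.save_model(args.tokenizer_path)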

I think it may just be that the term special_tokens is overloaded in HuggingFace, and this mask is only meant to cover the template tokens like <s> and </s>.
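In case it clarifies what I'm after: if the answer is indeed "build the mask yourself and pass it to the collator", the workaround I have in mind is roughly the sketch below (I'm not sure this is the intended approach):

from transformers import DataCollatorForLanguageModeling

def build_special_tokens_mask(input_ids, tokenizer):
    # flag every id that appears in all_special_ids, which does include 32000
    special_ids = set(tokenizer.all_special_ids)
    return [1 if tok_id in special_ids else 0 for tok_id in input_ids]

# tokenizer is the RobertaTokenizer loaded above
encoding = tokenizer("A test [SPECIAL_TOK] now")
encoding["special_tokens_mask"] = build_special_tokens_mask(encoding["input_ids"], tokenizer)
# for the example above this gives [1, 0, 0, 1, 0, 1], since 0, 2 and 32000 are all in all_special_ids

# as far as I can tell, the MLM collator skips positions where special_tokens_mask is 1
# when the mask is already present in the batch, so this should keep [SPECIAL_TOK]
# from being masked/predicted
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)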