Replace special [unusedX] tokens in a tokenizer to add domain-specific words

Let’s say I have a domain-specific word that I want to add to the tokenizer I am using to fine-tune a model further. The BERT tokenizer is one of those tokenizers that has [unusedX] tokens. One way to add new tokens is the add_tokens or add_special_tokens method, e.g.:

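from transformers import BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
tokenizer.add_tokens(['DomainSpecificWord'])  # appended after the existing vocabulary
print(tokenizer.encode('DomainSpecificWord'))
# [101, 30522, 102]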

However, this increases the size of the vocabulary, because the newly added word is assigned a new id at the end (so the model's embedding matrix has to be resized as well). The BERT tokenizer has almost 1000 [unusedX] tokens that could be used for this purpose instead. However, I haven’t found an example or documentation that shows how to achieve that.

P.S. I tried using
tokenizer.vocab['DomainSpecificWord'] = tokenizer.vocab.pop('[unused701]'), but it didn’t work.
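For what it's worth, one approach that might work is to edit the vocab.txt file directly instead of the in-memory vocab, since in a BERT vocab.txt the line index is the token id. The sketch below is only an illustration under that assumption: replace_unused_tokens is a hypothetical helper (not part of transformers), and it lowercases the new words by default on the assumption that an uncased tokenizer lowercases its input before the vocabulary lookup.

```python
from pathlib import Path


def replace_unused_tokens(vocab_path, new_words, lowercase=True):
    """Overwrite [unusedN] lines of a BERT vocab.txt with new words.

    Hypothetical helper, not a transformers API. Returns a dict that maps
    each newly placed word to the token id (line index) it now occupies.
    """
    path = Path(vocab_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    existing = set(lines)

    # In a BERT vocab.txt the line index is the token id.
    free_slots = [
        i for i, tok in enumerate(lines)
        if tok.startswith("[unused") and tok.endswith("]")
    ]

    assigned = {}
    for word in new_words:
        if lowercase:
            # Assumption: an uncased model lowercases text before lookup,
            # so a mixed-case vocabulary entry would never be matched.
            word = word.lower()
        if word in existing:
            continue  # already in the vocabulary, keep its current id
        if not free_slots:
            raise ValueError("no [unusedN] slots left in the vocabulary")
        idx = free_slots.pop(0)
        lines[idx] = word  # reuse the slot, so the vocabulary size is unchanged
        existing.add(word)
        assigned[word] = idx

    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return assigned
```

The vocab.txt could be written out with tokenizer.save_vocabulary(some_dir), edited with the helper, and then loaded again with something like BertTokenizerFast(vocab_file=path_to_vocab_txt). I have not verified this end to end, so treat it as a starting point. The embeddings at the reused ids start out as whatever the pretrained checkpoint stored for the [unusedX] tokens, so the new words still need fine-tuning to become meaningful.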