Following the documentation, I was able to add new tokens to a Bert tokenizer: huggingface.co/docs/transformers/…/add_tokens
The new tokens should help Bert on my ModelForSequenceClassification task, but in practice they don't: without the new tokens, the model performed better after a few epochs of training.
I wonder if I'm missing a step here. Should I first go back to an MLM task on my training dataset, so the new token embeddings get learned, and only then fine-tune for classification?