Can we fine-tune right away after add_tokens?


Following the documentation, I could add new tokens to a BERT tokenizer: …/add_tokens

The new tokens should help BERT in my ModelForSequenceClassification task, but in practice they don't: without the new tokens, the model performed better after a few epochs of training.

I wonder if I'm missing a step here. Should I first run an MLM (masked language modeling) pass over my training dataset, so the new token embeddings get learned, and only then fine-tune for classification?