Add_tokens + finetune

I use the Hugging Face Trainer class to fine-tune an mT5 model. Since my data (AMR graphs) contains many tokens such as :ARG0, :op2 or :name, which are normally split into word pieces, I added those 219 tokens to the tokenizer with

from transformers import MT5ForConditionalGeneration, MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")
model     = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

toks = [ ... my list of tokens ... ]               # the 219 AMR tokens, e.g. ":ARG0", ":op2", ":name"
tokenizer.add_tokens(toks, special_tokens=False)   # register them as ordinary (non-special) tokens
model.resize_token_embeddings(len(tokenizer))      # grow the embedding matrix to cover the new ids

and then start the training.
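Roughly, the training is a standard Trainer setup along these lines (the hyperparameters, output paths and dataset variable here are simplified placeholders, not my exact settings):

from transformers import Trainer, TrainingArguments, DataCollatorForSeq2Seq

training_args = TrainingArguments(
    output_dir="mt5-amr",              # placeholder output directory
    per_device_train_batch_size=8,     # placeholder hyperparameters
    num_train_epochs=10,
    save_strategy="epoch",
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,       # tokenized AMR training set (not shown here)
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
trainer.save_model("mt5-amr/final")    # fine-tuned weights used later for testing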

Once the model is fine-tuned I run my test, and once again I add the same tokens, in the same order, to the tokenizer. But the result is catastrophic: the F1 score drops from 80% to 50%, so evidently something is going wrong. I compared the tokenisation with and without the added tokens and it looks fine, but I have not the slightest idea where else to check. Can you give me a hint about the error I'm making?
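Concretely, the test script rebuilds the tokenizer from the base model, re-adds the tokens and then loads the fine-tuned weights, roughly like this (the checkpoint path is a placeholder):

from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# rebuild the tokenizer from the base model and re-add the same 219 tokens in the same order
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")
tokenizer.add_tokens(toks, special_tokens=False)   # toks: the identical list used before training

# load the fine-tuned weights saved during training (path is a placeholder)
model = MT5ForConditionalGeneration.from_pretrained("mt5-amr/final")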
I thought (possibly erroneously) that the added tokens would get random vectors which would then be updated during fine-tuning. If this is not the case, is there a way to make it happen? And if not, what is the point of adding new tokens?
Could anybody elaborate on this?
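If it helps, this is a small self-contained sketch of how I would check whether the embeddings of the added tokens are actually being trained; the three tokens are just examples from my list, and the fine-tuning step itself is omitted:

import torch
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")
toks = [":ARG0", ":op2", ":name"]                   # a few of the 219 added tokens, for illustration
tokenizer.add_tokens(toks, special_tokens=False)

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
model.resize_token_embeddings(len(tokenizer))

ids = tokenizer.convert_tokens_to_ids(toks)         # ids of the newly added tokens
before = model.get_input_embeddings().weight[ids].detach().clone()   # freshly initialised rows

# ... fine-tuning with the Trainer would happen here ...

after = model.get_input_embeddings().weight[ids]
print(torch.allclose(before, after))                # False after training would mean the rows were updated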

Environment info

  • transformers version: 4.11.3
  • Platform: Linux-5.13.0-30-generic-x86_64-with-glibc2.17
  • Python version: 3.8.12
  • PyTorch version (GPU): 1.9.1
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no