Tokenizer - Add new Tokens

Hi,
I’m trying to use the Protein T5 Model (Rostlab/prot_t5_xl_uniref50 · Hugging Face) with some additional letters other than the traditional amino acids.

I can add these tokens to the Tokenizer through its method “add_tokens”.

But do I need to take any care when doing so? Does the order in which I add these tokens matter, or their order relative to the tokens already present? Do I need to reorder them somehow?

Thanks in advance.

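You don’t have to do anything special; just call tokenizer.add_tokens(<list_of_words>) directly. The order of the tokens doesn’t matter: new tokens are appended after the existing vocabulary, so the ids of the existing tokens don’t change and no reordering is needed. If you then feed the new tokens to the model, also call model.resize_token_embeddings(len(tokenizer)) so that the embedding matrix has a row for each new id.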
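
In case it helps, here is a minimal sketch of the whole procedure. The helper name add_tokens_and_resize and the placeholder tokens "<mod_1>" and "<mod_2>" are made up for illustration; the calls add_tokens, get_input_embeddings, resize_token_embeddings and convert_tokens_to_ids are the standard transformers methods. It assumes you want to resize only when the tokenizer has outgrown the embedding matrix, which may or may not be what you need for your checkpoint.

```python
def add_tokens_and_resize(tokenizer, model, new_tokens):
    """Add new_tokens to the tokenizer and grow the model's embeddings if needed.

    Hypothetical helper, not part of transformers. Returns a dict that maps
    each requested token to the id it has after the call.
    """
    new_tokens = list(new_tokens)

    # add_tokens skips tokens that already exist and appends the rest after
    # the current vocabulary, in the order given.
    tokenizer.add_tokens(new_tokens)

    # Grow the embedding matrix only if the tokenizer now has more entries
    # than the model has embedding rows.
    current_rows = model.get_input_embeddings().weight.shape[0]
    if len(tokenizer) > current_rows:
        model.resize_token_embeddings(len(tokenizer))

    return {tok: tokenizer.convert_tokens_to_ids(tok) for tok in new_tokens}


# Example usage (left commented out because it downloads the checkpoint):
#
# from transformers import T5EncoderModel, T5Tokenizer
#
# name = "Rostlab/prot_t5_xl_uniref50"
# tokenizer = T5Tokenizer.from_pretrained(name, do_lower_case=False)
# model = T5EncoderModel.from_pretrained(name)
#
# # "<mod_1>" and "<mod_2>" are placeholders for your extra letters.
# ids = add_tokens_and_resize(tokenizer, model, ["<mod_1>", "<mod_2>"])
# print(ids)
```

Swapping the order of the list only swaps which id each new token gets; the original tokens keep their ids either way. Keep in mind that the embedding rows for the new tokens are freshly initialised and untrained, so you will need to fine-tune before they carry any meaning, and you should save the extended tokenizer together with the model so the ids stay consistent.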