Unknown character although it is present in the vocabulary list


I am using the M2M100 model, but I noticed that some characters that are present in the vocabulary list (data_dict128k.txt) are marked as unknown by the model when translating. Is there any way to make the model handle these vocabulary entries?

For example, the character ▬:

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
tokenizer.src_lang = "en"  # source language: English

while True:
    text = input("Enter a text: ")
    encoded_en = tokenizer(text, return_tensors="pt")
    # Force French as the target language
    generated_tokens = model.generate(**encoded_en, forced_bos_token_id=tokenizer.get_lang_id("fr"))
    print("\n\n" + tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0])

Input:

▬▬Hello▬ world▬

Output:

<unk> <unk> Monde <unk>
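
To narrow down where the <unk> comes from, it may help to check which sub-word pieces of the input already map to the unknown id before the model is involved. The following is only a minimal sketch: the helper name find_unknown_pieces is hypothetical, and it assumes the tokenizer exposes the usual transformers members tokenize, convert_tokens_to_ids and unk_token_id.

```python
def find_unknown_pieces(tokenizer, text):
    """Return the sub-word pieces of `text` that map to the unknown id.

    `tokenizer` is assumed to be a Hugging Face tokenizer (for example the
    M2M100Tokenizer loaded above); only `tokenize`, `convert_tokens_to_ids`
    and `unk_token_id` are used. The helper name itself is hypothetical.
    """
    # Split the text into sub-word pieces exactly as the tokenizer would.
    pieces = tokenizer.tokenize(text)
    # Look up the vocabulary id of each piece.
    ids = tokenizer.convert_tokens_to_ids(pieces)
    # Keep only the pieces whose id is the unknown-token id.
    return [piece for piece, idx in zip(pieces, ids) if idx == tokenizer.unk_token_id]
```

Calling find_unknown_pieces(tokenizer, "▬▬Hello▬ world▬") with the tokenizer from the script above should show whether the character is already lost at tokenization time.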

How can I make the model keep this character in the translation? I would also like the model to be able to handle emojis, so I would like to know how I can train the model globally, without going through each language prefix individually. (There are more than 9000 of them.)

Thank you in advance for your response.