Okay, thanks for that. I have trained my own tokenizer from scratch, so how do I use it in the masked language modeling task?
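For context, here's a minimal sketch of what I'm attempting, assuming the tokenizer was saved as a `tokenizer.json` with the `tokenizers` library and that I'm training a BERT-style model from scratch. The path and the special-token strings are just placeholders for my setup:

```python
from transformers import (
    PreTrainedTokenizerFast,
    BertConfig,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
)

# Wrap the raw tokenizer.json in a fast tokenizer and declare the special
# tokens -- the MLM data collator needs a mask token to do the masking.
# (Path and token strings below are placeholders for my own setup.)
tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="my-tokenizer/tokenizer.json",
    unk_token="[UNK]",
    pad_token="[PAD]",
    cls_token="[CLS]",
    sep_token="[SEP]",
    mask_token="[MASK]",
)

# Fresh model config sized to the custom vocabulary.
config = BertConfig(vocab_size=tokenizer.vocab_size)
model = BertForMaskedLM(config)

# The collator masks 15% of tokens and builds MLM labels on the fly.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)
```

Is this the right way to hook a from-scratch tokenizer into the MLM pipeline, or is there a step I'm missing?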