Manually set tokenizer vocabulary

Hi everyone,

I am currently working with several language models (GPT-2, BERT, and T5-like models). For some experiments I would like to use a “predetermined” vocabulary. If I already have the list of words, is it possible to build a tokenizer whose vocabulary is set manually to exactly that list, rather than learned from a corpus?
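
To make the question concrete, here is a minimal pure-Python sketch of the behaviour I am after. It is not the API of any existing library: the class name `FixedVocabTokenizer`, the `[UNK]` token, and the whitespace splitting are all my own assumptions for illustration. Ideally I would like the equivalent of this in a tokenizer that the models above can consume.

```python
from typing import Dict, List


class FixedVocabTokenizer:
    """Hypothetical word-level tokenizer built from a fixed word list."""

    def __init__(self, words: List[str], unk_token: str = "[UNK]") -> None:
        # Assumption: id 0 is reserved for out-of-vocabulary words.
        self.unk_token = unk_token
        self.vocab: Dict[str, int] = {unk_token: 0}
        # Remaining ids follow the order of the supplied list; duplicates are skipped.
        for word in words:
            if word not in self.vocab:
                self.vocab[word] = len(self.vocab)
        self.ids_to_tokens: Dict[int, str] = {i: t for t, i in self.vocab.items()}

    def encode(self, text: str) -> List[int]:
        # Assumption: plain whitespace splitting, no subword handling.
        unk_id = self.vocab[self.unk_token]
        return [self.vocab.get(token, unk_id) for token in text.split()]

    def decode(self, ids: List[int]) -> str:
        return " ".join(self.ids_to_tokens[i] for i in ids)


tokenizer = FixedVocabTokenizer(["hello", "world", "tokenizer"])
ids = tokenizer.encode("hello brave world")
print(ids)                    # [1, 0, 2]
print(tokenizer.decode(ids))  # hello [UNK] world
```

In this sketch, any word missing from my list maps to the unknown token, which is the behaviour I would expect from a tokenizer with a fixed vocabulary.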

Thanks in advance!