How can I load my own tokenizer from a handwritten vocab.txt? I currently call BertTokenizer.from_pretrained('file path'), where the file is a vocab.txt that I wrote by hand, and it raises this FutureWarning:
FutureWarning: Calling BertTokenizer.from_pretrained() with the path to a single file or url is deprecated and won't be possible anymore in v5. Use a model identifier or the path to a directory instead.
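For context, the call looks roughly like this (the path is just a placeholder for my local file):

```python
from transformers import BertTokenizer

# Loading directly from a single vocab file: this still works today,
# but it triggers the FutureWarning shown above.
tokenizer = BertTokenizer.from_pretrained("path/to/vocab.txt")
```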
vocab.txt looks like this:
[PAD]
[UNK]
[CLS]
[SEP]
[MASK]
...
my_word1
my_word2
my_word3
My goal is to train a model on a new "language", so the tokenizer's vocabulary has to be specified manually rather than loaded from a pretrained model.
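From the warning text, I assume the intended replacement is to point from_pretrained() at a directory containing the vocab file, or to build the tokenizer from the vocab file directly. A minimal sketch of what I think is meant (the ./my_tokenizer directory name is just an example, not my real path); is this the right way to keep loading my handwritten vocab.txt?

```python
from transformers import BertTokenizer

# Option 1: pass a directory that contains vocab.txt
# (e.g. ./my_tokenizer/vocab.txt), as the warning suggests.
tokenizer = BertTokenizer.from_pretrained("./my_tokenizer")

# Option 2: construct the tokenizer directly from the vocab file.
tokenizer = BertTokenizer(vocab_file="path/to/vocab.txt")
```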