Hi All
I am trying to find out how to build a custom tokenizer for DistilBert. All the examples I have seen just use the pre-trained tokenizer.
Can someone point me to how to build my custom model?
Thanks in advance.
Hi, this is probably where you can start if you want to build a fast tokenizer: https://huggingface.co/docs/tokenizers/python/master/quicktour.html
Thanks for the answer! I will try the example. It is a bit different, though, and I wanted to make sure I end up with something closer to the DistilBert tokenizer.
Thanks!
I still do not see how to build this as a DistilBertTokenizer. Any ideas?
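One possible route, offered as a minimal sketch rather than the official recipe: train a WordPiece vocabulary with the `tokenizers` library (DistilBert uses the same WordPiece scheme as BERT), save the resulting `vocab.txt`, and load that file into `DistilBertTokenizerFast` from `transformers`. The toy corpus, the vocabulary size of 200, and the temporary output directory are assumptions made for illustration; replace them with your own data and settings.

```python
import os
import tempfile

from tokenizers import BertWordPieceTokenizer
from transformers import DistilBertTokenizerFast

# Toy corpus (hypothetical): replace with an iterator over your own text.
corpus = [
    "DistilBert is a smaller version of BERT.",
    "A custom tokenizer learns its vocabulary from your own data.",
    "WordPiece splits rare words into smaller pieces.",
]

# Train a BERT-style WordPiece tokenizer (uncased) from scratch.
wp_tokenizer = BertWordPieceTokenizer(lowercase=True)
wp_tokenizer.train_from_iterator(
    corpus,
    vocab_size=200,        # assumed small value for the toy corpus
    min_frequency=1,
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
    show_progress=False,
)

# Write vocab.txt to a temporary directory (assumed location).
out_dir = tempfile.mkdtemp()
wp_tokenizer.save_model(out_dir)

# Wrap the trained vocabulary in the DistilBert tokenizer class.
tokenizer = DistilBertTokenizerFast(
    vocab_file=os.path.join(out_dir, "vocab.txt"),
    do_lower_case=True,
)

encoded = tokenizer("A custom tokenizer for DistilBert")
tokens = tokenizer.convert_ids_to_tokens(encoded["input_ids"])
print(tokens)
```

If this works for your data, calling `tokenizer.save_pretrained(...)` on a directory of your choice should let you reload it later with `DistilBertTokenizerFast.from_pretrained(...)`. Keep in mind that a model trained with the original vocabulary will not match a newly trained one, so a custom tokenizer generally goes together with training the model from scratch.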