I am using BartTokenizerFast to encode and decode my dataset.
I can find documentation for using the BPE model with the fast tokenizers
(including
https://colab.research.google.com/github/huggingface/transformers/blob/master/notebooks/01-training-tokenizers.ipynb)
Is there any useful documentation for using other tokenization units (e.g., WordPiece, word-level, character-level) with a fast tokenizer?
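For reference, here is a minimal sketch of what I am trying to do with a WordPiece model instead of BPE, using the `tokenizers` library directly (the toy corpus and vocabulary size are just placeholders, not my real dataset):

```python
from tokenizers import Tokenizer
from tokenizers.models import WordPiece
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordPieceTrainer

# Placeholder corpus; in practice this would be my dataset files.
corpus = ["hello world", "hello tokenizers", "world of wordpiece"]

# Build a tokenizer around the WordPiece model rather than BPE.
tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Train on an in-memory iterator; train(files=[...], trainer=...) also works.
trainer = WordPieceTrainer(vocab_size=100, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Encode and inspect the resulting tokens.
enc = tokenizer.encode("hello world")
print(enc.tokens)
```

I would like to know whether there is a notebook or guide like the one above that walks through this for the non-BPE models.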
Thank you!