Creating a WordPiece tokenizer from a counts file?

I would like to train a WordPiece tokenizer from scratch using a counts file that maps each token to its frequency.
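One possible approach, sketched below under assumptions: it uses the HuggingFace `tokenizers` library, and since its trainers consume raw text rather than (token, count) pairs, it expands each token `count` times into a generator fed to `train_from_iterator`. The `counts` dict stands in for your parsed counts file, and all vocabulary sizes and special tokens here are placeholders.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Placeholder for the parsed counts file; in practice you would read
# it from disk (e.g. one "token<TAB>count" pair per line).
counts = {"hello": 50, "world": 30, "wordpiece": 20}

def repeated_words(counts):
    # Expand each token according to its count so the trainer
    # observes frequencies through repetition.
    for token, count in counts.items():
        for _ in range(count):
            yield token

tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.WordPieceTrainer(vocab_size=200, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(repeated_words(counts), trainer=trainer)

print(tokenizer.encode("hello world").tokens)
```

For very large counts, materializing every repetition is wasteful; an alternative worth checking is whether the library version you use accepts pre-tokenized word counts directly, which would avoid the expansion entirely.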