Creating a WordPiece tokenizer from a counts file?

I would like to train a WordPiece tokenizer from scratch using a counts file that maps each token to its frequency.
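One possible approach, sketched below under assumptions: it uses the HuggingFace `tokenizers` library, and since its trainers consume raw text rather than (token, count) pairs, it expands each token `count` times into a generator fed to `train_from_iterator`. The `counts` dict stands in for your parsed counts file, and all vocabulary sizes and special tokens here are placeholders.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Placeholder for the parsed counts file; in practice you would read
# it from disk (e.g. one "token<TAB>count" pair per line).
counts = {"hello": 50, "world": 30, "wordpiece": 20}

def repeated_words(counts):
    # Expand each token according to its count so the trainer
    # observes frequencies through repetition.
    for token, count in counts.items():
        for _ in range(count):
            yield token

tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.WordPieceTrainer(vocab_size=200, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(repeated_words(counts), trainer=trainer)

print(tokenizer.encode("hello world").tokens)
```

For very large counts, materializing every repetition is wasteful; an alternative worth checking is whether the library version you use accepts pre-tokenized word counts directly, which would avoid the expansion entirely.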