I have a BPE merges file that was trained with a different trainer. How can I convert it to the vocab format used by the hf tokenizer? I don't want to spend a lot of time retraining the BPE model.
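One possible approach, as a minimal sketch: rebuild a token-to-id vocab directly from the merge rules. It assumes the merges file has one space-separated pair per line, with an optional `#version` header line. The function name `vocab_from_merges` is hypothetical, and the ids it assigns are arbitrary, so they will not match the ids the original trainer produced. Symbols that never take part in any merge cannot be recovered from the merges alone and would have to be added separately.

```python
def vocab_from_merges(merge_lines):
    """Return (vocab, pairs) built from BPE merge rules.

    vocab maps token -> id; pairs is the list of (left, right) merges
    in their original order.
    """
    pairs = []
    for raw in merge_lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and the optional "#version: ..." header.
        if not line or line.startswith("#version"):
            continue
        parts = line.split(" ")
        if len(parts) != 2:
            raise ValueError(f"unexpected merge line: {line!r}")
        pairs.append((parts[0], parts[1]))

    # Tokens produced by some merge rule.
    merged = {left + right for left, right in pairs}

    # Assumption: any symbol that is never the result of a merge is a base symbol.
    base = sorted({sym for pair in pairs for sym in pair if sym not in merged})

    vocab = {}
    for sym in base:
        vocab[sym] = len(vocab)
    # Merged tokens get ids in merge order, after the base symbols.
    for left, right in pairs:
        vocab.setdefault(left + right, len(vocab))
    return vocab, pairs


if __name__ == "__main__":
    import json

    example = ["#version: 0.2", "a b", "ab c"]
    vocab, pairs = vocab_from_merges(example)
    print(json.dumps(vocab, ensure_ascii=False))
```

The resulting dictionary can be written out as `vocab.json`, and the `vocab` and `pairs` values should be usable as the `vocab` and `merges` arguments of `tokenizers.models.BPE`. Whether the result tokenizes identically to the original depends on both sides using the same pre-tokenization and end-of-word conventions, which this sketch does not check.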