I want to avoid importing the transformers library at inference time, so I would like to export the fast tokenizer and later load it with the Tokenizers library.
On the Transformers side this is as easy as tokenizer.save_pretrained("tok"), but I am not sure what to do when loading it from Tokenizers.
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("tok/tokenizer.json")

This seems to work, but it ignores the two other files in the directory, tokenizer_config.json and special_tokens_map.json, so I suspect it won't produce the same tokens.
Is there a way to load a tokenizer using all the files in the directory? Or better, can a pretrained fast tokenizer be loaded directly from the Hub?
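For reference, here is a minimal, self-contained sketch of the round trip I have in mind. It builds a tiny word-level tokenizer in memory purely for illustration (the vocabulary and file name are placeholders; the real tokenizer.json would come from save_pretrained), then reloads it with the standalone Tokenizers library:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace

# Tiny word-level tokenizer, for illustration only; in my case the
# real file comes from tokenizer.save_pretrained("tok")
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tok = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tok.pre_tokenizer = Whitespace()

# Serialize to a single tokenizer.json, like the one in the "tok" directory
tok.save("tokenizer.json")

# Reload with the standalone tokenizers library -- no transformers import
reloaded = Tokenizer.from_file("tokenizer.json")
ids = reloaded.encode("hello world").ids
print(ids)  # [1, 2]
```

This covers the tokenizer.json part, but not the settings stored in tokenizer_config.json and special_tokens_map.json. If I am reading the docs right, newer versions of Tokenizers also expose Tokenizer.from_pretrained for loading from the Hub, though I am not sure it picks up those extra files either.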