I use a custom normalizer with my Tokenizer, and it seems there is no way to save a tokenizer that contains custom components.
I don’t see any option other than subclassing PreTrainedTokenizerFast so that it re-attaches the custom components at instance initialization. That is inconvenient, since I have to ship the component's code alongside the tokenizer every time I need to use it.
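For context, the subclass workaround I mean looks roughly like the sketch below. The names CustomNormalizer and MyTokenizerFast are hypothetical placeholders; the pattern is simply to let PreTrainedTokenizerFast load the serializable parts and then re-attach the non-serializable normalizer in __init__:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.normalizers import Normalizer
from transformers import PreTrainedTokenizerFast

class CustomNormalizer:
    """Placeholder for the real custom normalizer logic."""
    def normalize(self, normalized):
        # NormalizedString is mutated in place.
        normalized.lowercase()

class MyTokenizerFast(PreTrainedTokenizerFast):
    """Re-attaches the custom normalizer on every instantiation,
    since it cannot be stored in tokenizer.json."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Attach AFTER super().__init__, once the backend tokenizer exists.
        self.backend_tokenizer.normalizer = Normalizer.custom(CustomNormalizer())

# Toy word-level backend just to show the subclass in action.
backend = Tokenizer(WordLevel({"hello": 0, "[UNK]": 1}, unk_token="[UNK]"))
tok = MyTokenizerFast(tokenizer_object=backend, unk_token="[UNK]")
print(tok.backend_tokenizer.encode("HELLO").tokens)
```

The downside is exactly as described: anyone loading the tokenizer must also have the CustomNormalizer code on their Python path.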
A fully self-contained serialized tokenizer would make tokenization tasks much easier to handle. Are there any existing workarounds to save custom components, or at least plans to support serialization of custom components?
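To make the problem concrete, here is a minimal sketch (LowercaseNormalizer is a stand-in for my real component): calling save() on a tokenizer with a Normalizer.custom component raises, so the only workaround I have found is to swap in a serializable placeholder before saving and re-attach the custom normalizer after loading:

```python
import os
import tempfile

from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.normalizers import Normalizer, Sequence

class LowercaseNormalizer:
    """Stand-in for the real custom normalizer."""
    def normalize(self, normalized):
        normalized.lowercase()

tok = Tokenizer(WordLevel({"hello": 0, "[UNK]": 1}, unk_token="[UNK]"))
tok.normalizer = Normalizer.custom(LowercaseNormalizer())
print(tok.encode("HELLO").tokens)  # the custom normalizer works in memory

# tok.save(path) would raise here: custom components are not serializable.
# Workaround: replace with a serializable no-op, save, re-attach on load.
path = os.path.join(tempfile.mkdtemp(), "tokenizer.json")
tok.normalizer = Sequence([])  # serializable placeholder
tok.save(path)

reloaded = Tokenizer.from_file(path)
reloaded.normalizer = Normalizer.custom(LowercaseNormalizer())
print(reloaded.encode("HELLO").tokens)
```

This round-trips, but only because the LowercaseNormalizer class is available again at load time, which is exactly the sharing burden I would like to avoid.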