Save custom components

Hi,

I use a custom normalizer with my 🤗 Tokenizer, and there seems to be no way to save a tokenizer that contains custom components.

The only approach I can see is a custom PreTrainedTokenizerFast subclass that sets up the custom components when the instance is initialized. That is inconvenient, because I have to share the code for the component along with the tokenizer every time I want to use it somewhere else.
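To make the workaround concrete, here is a minimal sketch of the same idea at the `tokenizers` level rather than as a PreTrainedTokenizerFast subclass: the custom normalizer is swapped for a serializable placeholder before saving and re-attached in code after loading. The class `HyphenLowercaseNormalizer`, the helper functions, and the toy vocabulary are all made up for illustration, and the sketch assumes that saving with a custom normalizer attached raises an error, which is what I observe.

```python
import os
import tempfile

from tokenizers import NormalizedString, Tokenizer
from tokenizers.models import WordLevel
from tokenizers.normalizers import NFC, Normalizer
from tokenizers.pre_tokenizers import Whitespace


class HyphenLowercaseNormalizer:
    """Hypothetical custom component: lowercases and turns hyphens into spaces."""

    def normalize(self, normalized: NormalizedString):
        # NormalizedString is modified in place.
        normalized.lowercase()
        normalized.replace("-", " ")


def attach_custom(tokenizer: Tokenizer) -> Tokenizer:
    # The custom component has to be re-created from Python code every time.
    tokenizer.normalizer = Normalizer.custom(HyphenLowercaseNormalizer())
    return tokenizer


def save_without_custom(tokenizer: Tokenizer, path: str) -> None:
    # Swap in a serializable placeholder, save, then restore the custom one.
    tokenizer.normalizer = NFC()
    tokenizer.save(path)
    attach_custom(tokenizer)


def load_with_custom(path: str) -> Tokenizer:
    # The JSON file only contains the placeholder; the real normalizer
    # must be available as code wherever the tokenizer is loaded.
    return attach_custom(Tokenizer.from_file(path))


# Toy tokenizer, only for illustration.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tokenizer = Tokenizer(WordLevel(vocab=vocab, unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
attach_custom(tokenizer)

path = os.path.join(tempfile.mkdtemp(), "tokenizer.json")
save_without_custom(tokenizer, path)

reloaded = load_with_custom(path)
tokens = reloaded.encode("Hello-World").tokens
print(tokens)
```

This shows the problem: the saved `tokenizer.json` is not self-contained, so `HyphenLowercaseNormalizer` and `load_with_custom` have to be shipped separately to everyone who uses the tokenizer.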

A single, fully serialized tokenizer would make tokenization tasks much easier to handle. Are there any existing workarounds for saving custom components, or at least plans to support serialization of custom components?

Best,
Eugene
