How to add additional custom pre-tokenization processing?

Hi,

I was able to create a custom pre-tokenizer based on the example linked above, but I'm not able to save the tokenizer because of the exception "Custom PreTokenizer cannot be serialized". I'm wondering how to bypass this.
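
For context, here is a minimal sketch of the setup and of one workaround I'm considering: swap in a built-in pre-tokenizer just before saving, then re-attach the custom one after loading. `DashPreTokenizer`, `build_tokenizer`, `save_without_custom` and `load_with_custom` are hypothetical names I made up for illustration, and the tiny `WordLevel` vocabulary is only there to make the snippet self-contained. I'm assuming the `tokenizers` Python package; I don't know whether this is the recommended approach.

```python
from typing import List

from tokenizers import NormalizedString, PreTokenizedString, Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import PreTokenizer, Whitespace


class DashPreTokenizer:
    """Hypothetical custom pre-tokenizer: splits on '-' and drops the dash."""

    def _split(self, i: int, normalized: NormalizedString) -> List[NormalizedString]:
        text = str(normalized)
        pieces, start = [], 0
        for idx, ch in enumerate(text):
            if ch == "-":
                if idx > start:
                    pieces.append(normalized[start:idx])
                start = idx + 1
        if start < len(text):
            pieces.append(normalized[start:len(text)])
        return pieces

    def pre_tokenize(self, pretok: PreTokenizedString):
        # Called by the library; delegates to the split function above.
        pretok.split(self._split)


def build_tokenizer() -> Tokenizer:
    # Toy vocabulary, for illustration only.
    vocab = {"foo": 0, "bar": 1, "[UNK]": 2}
    tok = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
    tok.pre_tokenizer = PreTokenizer.custom(DashPreTokenizer())
    return tok


def save_without_custom(tok: Tokenizer, path: str) -> None:
    # The custom component is a Python object and cannot be written to JSON,
    # so temporarily replace it with a serializable placeholder.
    tok.pre_tokenizer = Whitespace()
    try:
        tok.save(path)
    finally:
        tok.pre_tokenizer = PreTokenizer.custom(DashPreTokenizer())


def load_with_custom(path: str) -> Tokenizer:
    # The saved file only knows about the placeholder, so the custom
    # pre-tokenizer has to be re-attached by hand after loading.
    tok = Tokenizer.from_file(path)
    tok.pre_tokenizer = PreTokenizer.custom(DashPreTokenizer())
    return tok
```

The downside is that the saved file no longer describes the real pre-tokenization, so anyone loading it has to know to re-attach the custom class.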
