Is there a way to implement a custom post-processor?
On this forum, I did see people discussing how to implement a custom pre-tokenizer and normalizer, but I haven’t seen any posts discussing the post-processor.
My first guess was to subclass `PostProcessor` and implement `num_special_tokens_to_add` and `process`, but trying to initialize the subclass just gives a `TypeError: No constructor defined`.
(I assume this has something to do with the Rust-Python binding not defining a `__new__` method.)
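For reference, here is a minimal sketch of the attempt described above. The class name `MyPostProcessor` and the pass-through method bodies are placeholders for illustration, and the method signatures are assumed to mirror those of the built-in processors; the exact error text may differ between library versions.

```python
from tokenizers.processors import PostProcessor


class MyPostProcessor(PostProcessor):
    """Hypothetical pass-through post-processor (illustration only)."""

    def num_special_tokens_to_add(self, is_pair):
        # This sketch adds no special tokens.
        return 0

    def process(self, encoding, pair=None, add_special_tokens=True):
        # Return the encoding unchanged.
        return encoding


# Defining the subclass works; instantiating it is where the error appears.
try:
    processor = MyPostProcessor()
except TypeError as err:
    print(f"TypeError: {err}")
```

Defining the class succeeds, so the failure seems to come from construction rather than from subclassing itself.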
So, does the HuggingFace tokenizers library support custom post-processing? If so, are there any instructions or example code available?
Thank you.