How to add custom pre-tokenization processing?

Are there plans for this to become a documented part of the API? I noticed that the CustomDecoder code no longer works (I believe the method name changed). It would be great to have a stable API for this stuff, although I get that it's a pretty niche thing.