Difficulty putting simple sentence-piece averaging model on Hub


My friend and I want to move a model from GitHub - jwieting/paraphrastic-representations-at-scale to the HF Hub, with an inference API for producing text embeddings. The tricky part is that this is not a Transformers model: it just averages the embeddings of the tokens produced by a sentencepiece model, so it could perhaps be viewed as a 0-layer Transformer without positional embeddings.
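To make the idea concrete, here is a minimal sketch of the averaging step, not the code from the repo. The names `average_embedding` and `toy_table` are hypothetical, and the hand-written pieces stand in for what a real sentencepiece model would output.

```python
from typing import Dict, List


def average_embedding(pieces: List[str],
                      table: Dict[str, List[float]],
                      dim: int) -> List[float]:
    """Average the embedding vectors of the given subword pieces.

    Pieces missing from the table are skipped; if nothing is left,
    a zero vector of length `dim` is returned.
    """
    vectors = [table[p] for p in pieces if p in table]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) iterates over the columns, i.e. one embedding dimension at a time
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy embedding table (assumption): a real one would be loaded from the checkpoint.
toy_table = {
    "_hello": [1.0, 0.0],
    "_wor": [0.0, 1.0],
    "ld": [2.0, 2.0],
}

# Hand-written pieces standing in for sentencepiece output on "hello world".
sentence_vec = average_embedding(["_hello", "_wor", "ld"], toy_table, dim=2)
```

There are no attention layers and no position information here, which is why the result does not depend on the order of the pieces.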

We have an initial attempt here:

The issue is that we created a custom tokenizer as a subclass of WordTokenizer (an abstract class) using the code from the repo, but we do not know how to add it to the Hub. The ready-to-use options are WhitespaceTokenizer and PhraseTokenizer, but neither is appropriate because they are not sentencepiece models.

What is the best way to proceed here? Ideally we would not have to add any code to these repos, but we are happy to do so if that is the best way forward. Any advice/support would be great!

Thank you!