How to wrap a non-neural LM and make it fully Hugging Face compatible?

My goal is to seamlessly integrate a non-neural LM into the Hugging Face ecosystem. The system, MBLM, implements a fast approximate k-NN next-word predictor that can run in autoregressive (CausalLM) mode. Internally, the core next-word prediction step produces a probability distribution over tokens that could be exposed externally.

I’ve been looking into writing a custom subclass of PreTrainedModel and would appreciate some guidelines for the case where the model is truly non-neural (but functionally compatible, as sketched above).
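For concreteness, here is a minimal sketch of what such a wrapper could look like. This is not MBLM's actual API: `MBLMConfig`, `MBLMForCausalLM`, and the uniform-distribution placeholder standing in for the k-NN prediction step are all hypothetical names for illustration. The idea is that a `PreTrainedModel` subclass only needs a config, a `forward` that returns logits in a `CausalLMOutput`, and a `prepare_inputs_for_generation` for `.generate()` to work, even if nothing neural happens inside.

```python
import torch
from transformers import GenerationMixin, PretrainedConfig, PreTrainedModel
from transformers.modeling_outputs import CausalLMOutput


class MBLMConfig(PretrainedConfig):
    model_type = "mblm"

    def __init__(self, vocab_size=32000, **kwargs):
        self.vocab_size = vocab_size
        super().__init__(**kwargs)


# Explicitly inheriting GenerationMixin keeps .generate() available on newer
# transformers versions, where PreTrainedModel no longer provides it by default.
class MBLMForCausalLM(PreTrainedModel, GenerationMixin):
    config_class = MBLMConfig
    main_input_name = "input_ids"

    def __init__(self, config):
        super().__init__(config)
        # A single dummy parameter keeps device/dtype plumbing
        # (.to(), .device, save_pretrained) working even though the
        # real predictor is non-neural.
        self.dummy = torch.nn.Parameter(torch.zeros(1), requires_grad=False)

    def forward(self, input_ids, attention_mask=None, **kwargs):
        batch, seq_len = input_ids.shape
        # Placeholder for the k-NN step: a uniform distribution over the
        # vocabulary at every position. The real model would call into MBLM
        # here and return its per-position next-word distribution.
        probs = torch.full(
            (batch, seq_len, self.config.vocab_size),
            1.0 / self.config.vocab_size,
        )
        # generate() consumes logits, so expose log-probabilities.
        return CausalLMOutput(logits=torch.log(probs))

    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # No KV cache or past state: feed the full prefix at every step.
        return {"input_ids": input_ids}


config = MBLMConfig(vocab_size=100, pad_token_id=0, eos_token_id=1)
model = MBLMForCausalLM(config)
print(model.generate(torch.tensor([[2, 3, 4]]), max_new_tokens=5, do_sample=False))
```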

Shameless plug: this is a CPU-only, eco-friendly LLM alternative that scales well, supports incremental learning, is fast, and memorizes training data explicitly.

Thanks for sharing tips!

Antal


In an extreme case, if you inherit from an existing Transformers model or from the model base class and override everything except a few methods, such as the loading-related ones and the forward call, with dummies, it should work…
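Continuing the hypothetical sketch from the question above, one way to go from "it runs" to "it loads like any other checkpoint" is to register the custom classes with the Auto classes; `AutoConfig.register` and `AutoModelForCausalLM.register` are the standard hooks for that. Note that only tensor state (here, just the dummy parameter) ends up in the saved weights, so MBLM's actual k-NN data would still need its own serialization, e.g. by extending `save_pretrained`/`from_pretrained`.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Assumes the MBLMConfig / MBLMForCausalLM sketch above.
AutoConfig.register("mblm", MBLMConfig)
AutoModelForCausalLM.register(MBLMConfig, MBLMForCausalLM)

model = MBLMForCausalLM(MBLMConfig(vocab_size=100))
model.save_pretrained("mblm-demo")                       # writes config.json + weights
reloaded = AutoModelForCausalLM.from_pretrained("mblm-demo")
```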

This is about SentenceTransformers rather than Transformers, so the know-how isn't directly transferable, but I remembered that a (probably) non-neural-network model was introduced there, so I'm mentioning it.
