Hi there, wondering if it will be possible to quantize Mpnet models for SentenceTransformers soon?
Hi @nickmuchi, thanks for pointing that out! We would love to add support for Mpnet and other new models. Would you be interested in opening a PR to add Mpnet?
Basically, adding support for a new model takes 3 steps, and potentially 2 PRs in two repos:
[transformers]
- Add an OnnxConfig in transformers, which enables export to the ONNX format.
- Detailed guide on adding unsupported models
- You can find more discussion here
- Register the model's supported tasks in transformers.onnx.FeaturesManager. At that point you will be able to export the model to ONNX, use ORTModels for inference, and use ORTQuantizer for quantization.
[Optimum]
- If you are also interested in using ORTOptimizer for graph optimizations, you will need to open a PR in optimum to add support for Mpnet in ORTConfigManager.
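To give a rough idea of the first transformers step, here is a minimal sketch of what the OnnxConfig for Mpnet might look like. The class name MPNetOnnxConfigSketch is hypothetical, and the sketch assumes Mpnet takes only input_ids and attention_mask as inputs. In an actual PR the class would subclass transformers.onnx.OnnxConfig; it is kept standalone here so that it runs without any dependencies.

```python
from collections import OrderedDict
from typing import Mapping


class MPNetOnnxConfigSketch:
    """Hypothetical stand-in for an Mpnet OnnxConfig.

    In an actual PR this would subclass transformers.onnx.OnnxConfig;
    it is standalone here so the sketch runs without dependencies.
    """

    def __init__(self, task: str = "default") -> None:
        self.task = task

    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # Map each input name to its dynamic axes (axis index -> axis name).
        if self.task == "multiple-choice":
            dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
        else:
            dynamic_axis = {0: "batch", 1: "sequence"}
        # Assumption: Mpnet takes input_ids and attention_mask only.
        return OrderedDict(
            [("input_ids", dynamic_axis), ("attention_mask", dynamic_axis)]
        )


if __name__ == "__main__":
    # Show the declared inputs for the default task.
    print(dict(MPNetOnnxConfigSketch().inputs))
```

The inputs property is the main thing the export needs: it names each model input and says which of its axes may vary in size, such as the batch and sequence dimensions.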
Please feel free to reach out if you have any further questions, and I can guide you through this if you wish to contribute.
Hi @Jingya, I have never opened a PR before, but I would love to if you are able to guide me. Thanks.