Supporting ONNX-optimized models

Hi there,

Are there any plans for Hugging Face to distribute pre-trained models as ONNX files? My use case is embedded inference with pre-trained models.

I don’t necessarily need the raw PyTorch or TensorFlow model; the default quantized ONNX export would be enough for me.

I can already do this myself by combining the Hugging Face download utilities with the ONNX conversion script. Ideally, though, I would get the ONNX file directly from :hugs:
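For context, here is roughly what that manual route looks like today. This is a minimal sketch assuming the `convert_graph_to_onnx` helper shipped with `transformers` and the dynamic quantization utilities from `onnxruntime`; the model name and output paths are just placeholders:

```python
from pathlib import Path

from onnxruntime import InferenceSession
from onnxruntime.quantization import QuantType, quantize_dynamic
from transformers.convert_graph_to_onnx import convert

# Export the pre-trained model to ONNX. This downloads the weights from
# the model hub first, so it covers the "download utils" step as well.
convert(
    framework="pt",  # export from the PyTorch weights
    model="distilbert-base-uncased",
    output=Path("onnx/distilbert.onnx"),
    opset=11,
)

# Dynamic (weight-only) int8 quantization, similar to what the
# conversion script's quantized export produces.
quantize_dynamic(
    "onnx/distilbert.onnx",
    "onnx/distilbert-quantized.onnx",
    weight_type=QuantType.QInt8,
)

# For embedded inference, onnxruntime is then the only runtime
# dependency: no PyTorch or TensorFlow needed on the target device.
session = InferenceSession(
    "onnx/distilbert-quantized.onnx",
    providers=["CPUExecutionProvider"],
)
```

That works, but it means every downstream user re-runs the same export; shipping the ONNX file from the hub would remove that step entirely.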

[ADDITION] It would probably not be as simple as “get the ONNX file” if we want a plug-and-play experience. Happy to brainstorm on the right format!

Cheers,

Alex Combessie
