Supporting ONNX-optimized models

Hi there,

Are there any plans for Hugging Face to distribute pre-trained models as ONNX files? My use case is embedded inference with pre-trained models.

I don’t necessarily need the raw PyTorch or TensorFlow model; the default quantized ONNX export would be enough for me.

I can already do this myself by combining the Hugging Face download utilities with the ONNX conversion script. Ideally, though, I would get the ONNX file directly from :hugs:
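For context, here is roughly what that manual route looks like today. This is a minimal sketch assuming the `convert_graph_to_onnx` helper shipped with `transformers` and the dynamic quantization utilities from `onnxruntime`; the model name and output paths are just placeholders:

```python
from pathlib import Path

from onnxruntime import InferenceSession
from onnxruntime.quantization import QuantType, quantize_dynamic
from transformers.convert_graph_to_onnx import convert

# Export the pre-trained model to ONNX. This downloads the weights from
# the model hub first, so it covers the "download utils" step as well.
convert(
    framework="pt",  # export from the PyTorch weights
    model="distilbert-base-uncased",
    output=Path("onnx/distilbert.onnx"),
    opset=11,
)

# Dynamic (weight-only) int8 quantization, similar to what the
# conversion script's quantized export produces.
quantize_dynamic(
    "onnx/distilbert.onnx",
    "onnx/distilbert-quantized.onnx",
    weight_type=QuantType.QInt8,
)

# For embedded inference, onnxruntime is then the only runtime
# dependency: no PyTorch or TensorFlow needed on the target device.
session = InferenceSession(
    "onnx/distilbert-quantized.onnx",
    providers=["CPUExecutionProvider"],
)
```

That works, but it means every downstream user re-runs the same export; shipping the ONNX file from the hub would remove that step entirely.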

[ADDITION] It would probably not be as simple as “get the ONNX file” if we want a plug-and-play experience. Happy to brainstorm on the right format!

Cheers,

Alex Combessie
