Transformers.onnx vs optimum.onnxruntime

Hello. I am interested in converting a model to ONNX to get faster inference, but I saw there are two possible approaches: the transformers.onnx package and optimum.onnxruntime.

Should I convert the model to ONNX with the first and then use it with Optimum? It looks like Optimum can convert models to ONNX on its own now, so what is the point of the transformers.onnx package?

Hi @Maxinho,

The ORTModel APIs in Optimum handle the conversion of models from PyTorch to ONNX when needed (we currently use the export in transformers.onnx), and implement inference for different tasks so that you can use them just like the AutoModel APIs in Transformers.
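For example, here is a minimal sketch of loading a model with an ORTModel class and running it through a pipeline. The checkpoint name is just illustrative, and depending on your Optimum version the export argument is either export=True (newer releases) or from_transformers=True (older ones):

```python
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint

# Exports the PyTorch model to ONNX on the fly; older Optimum versions
# use from_transformers=True instead of export=True.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The ORTModel drops straight into the usual Transformers pipeline API.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("ONNX Runtime makes inference faster."))
```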

In terms of acceleration, Optimum offers ORTOptimizer and ORTQuantizer, with which you can optimize your computation graph and quantize your ONNX model to accelerate inference even further.
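As a rough sketch of what that looks like (the configuration classes and defaults here are based on recent Optimum versions, so check the docs for your release), graph optimization and dynamic quantization go roughly like this:

```python
from optimum.onnxruntime import (
    ORTModelForSequenceClassification,
    ORTOptimizer,
    ORTQuantizer,
)
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

# Graph optimization: node fusion and similar rewrites.
# Level 2 enables extended fusions on top of the basic ones.
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="onnx_optimized",
    optimization_config=OptimizationConfig(optimization_level=2),
)

# Dynamic quantization targeting AVX512-VNNI CPUs; pick the
# AutoQuantizationConfig preset that matches your hardware.
quantizer = ORTQuantizer.from_pretrained(model)
quantizer.quantize(
    save_dir="onnx_quantized",
    quantization_config=AutoQuantizationConfig.avx512_vnni(is_static=False),
)
```

The optimized or quantized model saved in save_dir can then be reloaded with the same ORTModel class and used exactly as in the first example.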