ONNX Conversion - transformers.onnx vs convert_graph_to_onnx.py

Hi @meggers.

I was in the same situation as you: I knew how to use the old method (transformers/convert_graph_to_onnx.py), but not the new one (transformers.onnx), to get a quantized ONNX version of a Hugging Face task model (for example, a Question-Answering model).
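For reference, here is a minimal sketch of what the old method does, using the Python helpers behind transformers/convert_graph_to_onnx.py. The checkpoint name, opset and output paths are illustrative (they are not from the original question):

```python
# Old method: export + quantize via the helpers behind convert_graph_to_onnx.py
# (equivalent to running the script with --pipeline question-answering --quantize).
from pathlib import Path
from transformers.convert_graph_to_onnx import convert, quantize

# Illustrative output path; the target folder must be empty.
onnx_path = Path("onnx-old/model.onnx")

convert(
    framework="pt",                                    # export the PyTorch graph
    model="distilbert-base-uncased-distilled-squad",   # illustrative QA checkpoint
    output=onnx_path,
    opset=11,
    pipeline_name="question-answering",
)

# Dynamic quantization, as triggered by the script's --quantize flag;
# returns the path of the quantized ONNX file.
quantized_path = quantize(onnx_path)
print(quantized_path)
```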

To illustrate the new method, I have published this notebook on Colab: ONNX Runtime with transformers.onnx for HF tasks models (for example: QA model), not only with transformers/convert_graph_to_onnx.py.
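For readers who cannot open the notebook, here is a minimal sketch of the same workflow with the new method, assuming a transformers version whose transformers.onnx CLI supports the question-answering feature for this checkpoint. The model name and file paths are illustrative, and the quantization step uses ONNX Runtime's dynamic quantization (it is not part of transformers.onnx itself):

```python
# Step 1 (run in a shell): export the model with the new transformers.onnx package.
#
#   python -m transformers.onnx --model=distilbert-base-uncased-distilled-squad \
#       --feature=question-answering onnx/
#
# This writes onnx/model.onnx.

# Step 2: dynamic quantization of the exported graph with ONNX Runtime.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    "onnx/model.onnx",            # file produced by the export above
    "onnx/model-quantized.onnx",  # illustrative output path
    weight_type=QuantType.QInt8,  # quantize the weights to int8
)

# Step 3: inference on the quantized model with ONNX Runtime.
import numpy as np
from onnxruntime import InferenceSession
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-distilled-squad")
session = InferenceSession("onnx/model-quantized.onnx")

question = "What does the notebook show?"
context = "The notebook shows how to export and quantize a QA model with transformers.onnx."
inputs = tokenizer(question, context, return_tensors="np")

# Feed only the inputs that the exported graph actually declares.
onnx_input_names = {i.name for i in session.get_inputs()}
start_logits, end_logits = session.run(
    None, {name: value for name, value in dict(inputs).items() if name in onnx_input_names}
)

# Decode the most likely answer span.
start, end = int(np.argmax(start_logits)), int(np.argmax(end_logits))
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```

The quantize_dynamic call plays the same role here as the --quantize flag of the old script.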

I hope that @lysandre, @mfuntowicz, @valhalla and @lewtun will have some time to complete the online documentation Exporting transformers models and/or to update the Microsoft tutorials about ONNX.

Other topics about this subject:

There is also the Accelerate Hugging Face models page from Microsoft, but its notebooks look very complicated (heavy code).