ONNX Conversion - transformers.onnx vs convert_graph_to_onnx.py

Hey Folks,

I am attempting to convert a RobertaForSequenceClassification PyTorch model (fine-tuned for classification from distilroberta-base) to ONNX using transformers 4.9.2. When using the transformers.onnx package, the classifier head seems to be lost:

Some weights of the model checkpoint at {} were not used when initializing RobertaModel: ['classifier.dense.bias', 'classifier.out_proj.bias', 'classifier.out_proj.weight', 'classifier.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
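
If I read that warning correctly, the exporter is loading my checkpoint as the base RobertaModel, so the classification head is simply dropped. A minimal sketch of the difference, with a placeholder path standing in for my fine-tuned checkpoint:

from transformers import AutoModel, AutoModelForSequenceClassification

# Loading as the base model discards the classifier.* weights and triggers the warning above
base = AutoModel.from_pretrained("./my-finetuned-checkpoint")

# Loading with the task head keeps the classifier weights
clf = AutoModelForSequenceClassification.from_pretrained("./my-finetuned-checkpoint")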

When I specify a feature (--feature sequence-classification) I get an error stating that only default is supported. However, when I fall back to the convert_graph_to_onnx script that the documentation says is being deprecated, I am able to convert successfully using the --pipeline sentiment-analysis flag.
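
For concreteness, the two invocations I am comparing look roughly like this (model and output paths are placeholders):

# New package: fails for me with "only default is supported"
python -m transformers.onnx --model ./my-finetuned-checkpoint --feature sequence-classification onnx/

# Deprecated script: works
python -m transformers.convert_graph_to_onnx --framework pt --model ./my-finetuned-checkpoint --pipeline sentiment-analysis onnx/model.onnx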

Is this expected? Does transformers.onnx not support RobertaForSequenceClassification yet, or am I missing some step?

Hi @meggers.

I’m in the same situation as you: I know how to use the old method (transformers/convert_graph_to_onnx.py) but not the new one (transformers.onnx) to get the quantized ONNX version of a Hugging Face task model (for example, a question-answering model).

To illustrate this, I published a notebook on Colab: ONNX Runtime with transformers.onnx for HF tasks models (for example: QA model) (not only with transformers/convert_graph_to_onnx.py).
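
The overall recipe is: export with transformers.onnx, then quantize the exported graph with ONNX Runtime. Here is a sketch of what I mean (the QA checkpoint name is just an example, and the task argument assumes a transformers version recent enough to support task-specific features):

from pathlib import Path

from transformers import AutoModelForQuestionAnswering, AutoTokenizer
from transformers.models.roberta import RobertaOnnxConfig
from transformers.onnx import export
from onnxruntime.quantization import QuantType, quantize_dynamic

model_id = "deepset/roberta-base-squad2"  # example QA checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForQuestionAnswering.from_pretrained(model_id)

# ONNX config built from the model's own config, targeting the QA task
onnx_config = RobertaOnnxConfig(model.config, task="question-answering")

# export the PyTorch model to an ONNX graph
onnx_path = Path("qa-model.onnx")
export(preprocessor=tokenizer, model=model, config=onnx_config, opset=11, output=onnx_path)

# dynamic (post-training) quantization of the exported graph
quantize_dynamic(onnx_path, Path("qa-model-quantized.onnx"), weight_type=QuantType.QInt8)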

I hope that @lysandre, @mfuntowicz, @valhalla, or @lewtun will have some time to complete the online documentation (Exporting transformers models) and/or to update the Microsoft tutorials about ONNX.

Other topics about this subject:

There is also the Accelerate Hugging Face models page from Microsoft, but the notebooks look very complicated (heavy code).

I’m assuming you incorrectly tagged me? I have never used ONNX and don’t work for Hugging Face like the rest of the people you tagged do.

Sorry. I edited my post and deleted your username.

I’m having a similar issue and was wondering if you or @pierreguillou had any updates on how to convert fine-tuned sequence models to quantized ONNX models.

When trying the old conversion method I get an import error, which appears to be an ONNX bug. Below is the command and error:

python C:\Users\flabelle002\Anaconda3\envs\hf2\Lib\site-packages\transformers\convert_graph_to_onnx.py --framework pt --model ./local_cola_model --tokenizer bert-base-cased --quantize bert-base-uncased.onnx
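
A possible workaround I might try, assuming the failure comes from the --quantize step (just a sketch): run the same command without --quantize so the script only exports, then quantize the resulting file directly with ONNX Runtime:

python C:\Users\flabelle002\Anaconda3\envs\hf2\Lib\site-packages\transformers\convert_graph_to_onnx.py --framework pt --model ./local_cola_model --tokenizer bert-base-cased bert-base-uncased.onnx

and then, in Python:

from onnxruntime.quantization import QuantType, quantize_dynamic

# dynamic (post-training) quantization of the exported model
quantize_dynamic("bert-base-uncased.onnx", "bert-base-uncased-quantized.onnx", weight_type=QuantType.QInt8)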

Hi @pierreguillou ,

Just wondering if you know how to use the new one now? I’m trying to export a BERT model with this code:

from pathlib import Path

from transformers import AutoModel, AutoTokenizer
from transformers.models.bert import BertOnnxConfig
from transformers.onnx.convert import export

path = Path("/Volumes/workplace/upload_content/onnx/all-MiniLM-L6-v2.onnx")

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# export() expects a loaded model object, not a checkpoint name
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# build the ONNX config from the checkpoint's own config instead of a fresh BertConfig()
onnx_config = BertOnnxConfig(model.config)

export(
    preprocessor=tokenizer,
    model=model,
    output=path,
    config=onnx_config,
    opset=11,
)

But it doesn’t produce anything. I’m not sure how I can make this work. I wonder why the newer version is so much more complicated than the older one…
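
For reference, the documentation also shows a one-line CLI export that I may fall back to (the output directory is a placeholder):

python -m transformers.onnx --model=sentence-transformers/all-MiniLM-L6-v2 onnx/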