Export M2M100 model to ONNX

I’ve ported facebook/m2m100_418M to ONNX for the translation task using this, but when I visualize it with Netron, it requires 4 inputs: input_ids, attention_mask, decoder_input_ids, and decoder_attention_mask, and I don’t know how to run inference with ONNX Runtime.

How can I solve this problem?
Thanks in advance for your help.

Did you find a solution?

I have the same issue. Have you found a solution yet?

I tried to convert this model to ONNX by specifying the task, python3.8 -m transformers.onnx --model=facebook/m2m100_418M onnx/ --feature=seq2seq-lm-with-past, but in that case it says the model needs 54 inputs; otherwise I have the same problem. I know the model needs the input and output languages, but I can’t really work out how to use the model with ONNX. An example would be welcome :wink:

I also looked for hints in the PR that added export support for the model: M2M100 support for ONNX export by michaelbenayoun · Pull Request #15193 · huggingface/transformers · GitHub. I think it can be useful.

cc @lewtun

I have the same question. Could I get an example for this M2M100 ONNX model? It would be very helpful.

Hi folks, the best way to run inference with ONNX models is via the optimum library. It lets you plug ONNX models directly into the pipeline() function from transformers and thus skip all the annoying pre- and post-processing steps :slight_smile:

Here’s a demo for M2M100 based on the docs:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")
# `from_transformers` will export the model to ONNX on-the-fly 🤯
model = ORTModelForSeq2SeqLM.from_pretrained("facebook/m2m100_418M", from_transformers=True)
onnx_translation = pipeline("translation_en_to_de", model=model, tokenizer=tokenizer)

text = "My name is Lewis."
# returns [{'translation_text': 'Mein Name ist Lewis.'}]
pred = onnx_translation(text)

Hope that helps!


Running into the following error when I run the code as-is from @lewtun:

AttributeError: type object 'FeaturesManager' has no attribute 'determine_framework'

Using the following versions:
torch → ‘1.10.0’
transformers → ‘4.20.1’

cc @fxmarty who might be able to take a look :pray:


Thanks. Also, I’m not sure where the target language ‘de’ is specified in the tokenizer/model above. Greatly appreciate your help.

Hi @awaiskaleem, transformers==4.20.1 is 5 months old. Could you try updating (pip install --upgrade transformers)? The current supported stable version is 4.20.0. The code snippet from @lewtun works well for me with transformers==4.20.0 and optimum==1.5.1.

Additionally @NNDam, @double, @omoekan, @Jour, @echoRG, @awaiskaleem, I wanted to let you know that the ONNX export through transformers.onnx will likely soon rely on a soft dependency on optimum.exporters, where everything export-related will be maintained. You can check the documentation here.

Now, specifically for M2M100, keep in mind that it is a seq2seq (translation) model! Hence it uses both an encoder and a decoder, as detailed in the transformers docs. In transformers, the standard usage is model.generate(**inputs). However, by default the ONNX export cannot capture the generation loop around the decoder: transformers/utils.py at d51e7c7e8265d69db506828dce77eb4ef9b72157 · huggingface/transformers · GitHub. Hence, when exporting to ONNX as a single file, the model is hardly usable unless you do some manual surgery on the ONNX graph.
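To make the point concrete, here is a minimal sketch of that generation loop, which has a data-dependent exit condition and therefore lives in Python rather than inside the exported graph. The decoder_step function below is a hypothetical stand-in (it just replays a fixed token sequence so the snippet is self-contained); in real code it would be one forward pass of the decoder, e.g. an ONNX Runtime session call returning logits.

```python
EOS = 2  # hypothetical end-of-sequence token id for this sketch

def decoder_step(decoder_input_ids):
    # Stand-in for a real decoder forward pass: pretend the model
    # greedily emits tokens 5, 6, 7, then EOS.
    script = [5, 6, 7, EOS]
    return script[len(decoder_input_ids) - 1]

def greedy_generate(bos_token_id=0, max_length=20):
    tokens = [bos_token_id]
    for _ in range(max_length):
        next_token = decoder_step(tokens)
        tokens.append(next_token)
        if next_token == EOS:
            # This early exit depends on the generated data, which is
            # what a single static ONNX graph cannot express by default.
            break
    return tokens

print(greedy_generate())  # [0, 5, 6, 7, 2]
```

This loop is exactly what model.generate(**inputs) runs for you in transformers, which is why the export needs special handling for the decoder.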

The solution currently explored and used in Optimum’s ORTModelForSeq2SeqLM, which leverages ONNX Runtime, is to use two ONNX files: one for the encoder, and one for the decoder.

Using Optimum main (not yet in the stable release, but you can expect it next week), with python -m optimum.exporters.onnx --model valhalla/m2m100_tiny_random --for-ort m2m100_tiny_onnx_ort, we obtain two models:

  • an encoder expecting input_ids and attention_mask
  • a decoder expecting encoder_attention_mask, input_ids, and encoder_hidden_states. This closely follows transformers’ decoder and generate.
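To illustrate how the two files fit together, here is a shape-level sketch in which the session calls are replaced by numpy stand-ins. The encoder_run and decoder_run functions are hypothetical; with onnxruntime you would instead create two InferenceSession objects (one per exported file) and call their run methods with the same input names.

```python
import numpy as np

batch, src_len, hidden = 1, 4, 1024  # 1024 is the hidden size of m2m100_418M

def encoder_run(input_ids, attention_mask):
    # Real code: encoder_session.run(None, {"input_ids": ..., "attention_mask": ...})
    return np.zeros((input_ids.shape[0], input_ids.shape[1], hidden), np.float32)

def decoder_run(input_ids, encoder_attention_mask, encoder_hidden_states):
    # Real code: decoder_session.run(None, {...}) -> logits over the vocabulary
    vocab_size = 128112  # m2m100 vocab size; check your exported model
    return np.zeros((input_ids.shape[0], input_ids.shape[1], vocab_size), np.float32)

input_ids = np.ones((batch, src_len), dtype=np.int64)
attention_mask = np.ones((batch, src_len), dtype=np.int64)

# 1. Run the encoder once over the source sentence.
encoder_hidden_states = encoder_run(input_ids, attention_mask)

# 2. Feed its output (plus the source attention mask) to the decoder
#    at every generation step, growing decoder_input_ids each time.
decoder_input_ids = np.ones((batch, 1), dtype=np.int64)
logits = decoder_run(decoder_input_ids, attention_mask, encoder_hidden_states)
print(logits.shape)  # (1, 1, 128112)
```

The key point is that the encoder runs once, while the decoder is called repeatedly inside the generation loop with the cached encoder_hidden_states.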

So if you would like to use these exported ONNX models outside of Optimum, I simply recommend using the above command to export them and then handling the models yourself. But ORTModelForSeq2SeqLM is meant to save you that hassle.

If you want to try it right away, feel free to try the dev version: pip install -U git+https://github.com/huggingface/optimum.git@main