Hi @awaiskaleem, `transformers==4.20.1` is 5 months old. Could you try to update (`pip install --upgrade transformers`)? The currently supported stable version is 4.20.0. The code snippet from @lewtun works well for me with `transformers==4.20.0` and `optimum==1.5.1`.
Additionally @NNDam, @double, @omoekan, @Jour, @echoRG, @awaiskaleem, I wanted to let you know that the ONNX export through `transformers.onnx` will likely soon rely on a soft dependency on `optimum.exporters`, where all things export will be maintained. You can check the documentation here.
Now, specifically for M2M100, keep in mind that it is a seq2seq (translation) model! Hence, it uses both an encoder and a decoder, as detailed in the transformers docs. In transformers, the standard use is to call `model.generate(**inputs)`. However, by default the ONNX export cannot handle the autoregressive loop in the decoder: transformers/utils.py at d51e7c7e8265d69db506828dce77eb4ef9b72157 · huggingface/transformers · GitHub. Hence, when exporting to ONNX as a single file, the model will be hardly usable unless you do some manual surgery on the ONNX graph.
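To make the standard usage concrete, here is a minimal sketch with the tiny random test checkpoint used later in this post (`valhalla/m2m100_tiny_random`). It assumes network access to download the checkpoint and that its tokenizer files are available on the Hub; `generate()` runs the encoder once and then loops over the decoder autoregressively.

```python
# A hedged sketch of the standard transformers usage described above.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("valhalla/m2m100_tiny_random")
tokenizer = M2M100Tokenizer.from_pretrained("valhalla/m2m100_tiny_random")

tokenizer.src_lang = "en"
inputs = tokenizer("Hello world", return_tensors="pt")
# generate() loops over the decoder; forced_bos_token_id selects the
# target language, since M2M100 is a many-to-many translation model.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("fr"),
    max_new_tokens=10,
)
print(generated.shape)
```

It is exactly this per-step decoder loop inside `generate()` that a single-file ONNX export cannot represent.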
The solution that is currently explored and in use in Optimum's `ORTModelForSeq2SeqLM`, leveraging ONNX Runtime, is to use two ONNX files: one for the encoder, and one for the decoder.
Using Optimum main (not yet in the stable release, but you can expect it next week) with `python -m optimum.exporters.onnx --model valhalla/m2m100_tiny_random --for-ort m2m100_tiny_onnx_ort`, we obtain two models:
- an encoder expecting `input_ids` and `attention_mask`;
- a decoder expecting `encoder_attention_mask`, `input_ids` and `encoder_hidden_states`.

This closely follows transformers' decoder and `generate`.
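In case it helps to see the mechanics, here is a minimal, hedged sketch of the greedy decoding loop you would implement yourself over the two exported files. The `encoder_run`/`decoder_run` callables are illustrative placeholders for thin wrappers around `onnxruntime.InferenceSession.run`; the function name and signature are not part of Optimum.

```python
# Hedged sketch of greedy decoding over a two-file seq2seq ONNX export
# (encoder + decoder). The sessions are passed as callables so the loop
# itself is independent of onnxruntime.
import numpy as np

def greedy_decode(encoder_run, decoder_run, input_ids, attention_mask,
                  decoder_start_token_id, eos_token_id, max_length=32):
    """encoder_run(input_ids, attention_mask) -> encoder_hidden_states.
    decoder_run(decoder_input_ids, encoder_attention_mask,
                encoder_hidden_states) -> logits (batch, seq_len, vocab)."""
    # Run the encoder once; its hidden states are reused at every step.
    encoder_hidden_states = encoder_run(input_ids, attention_mask)
    decoder_input_ids = np.array([[decoder_start_token_id]], dtype=np.int64)
    for _ in range(max_length - 1):
        # The decoder consumes the tokens generated so far plus the
        # encoder outputs, matching the inputs listed above.
        logits = decoder_run(decoder_input_ids, attention_mask,
                             encoder_hidden_states)
        next_token = int(logits[0, -1].argmax())
        decoder_input_ids = np.concatenate(
            [decoder_input_ids, [[next_token]]], axis=1)
        if next_token == eos_token_id:
            break
    return decoder_input_ids[0].tolist()
```

This is the loop that `generate` normally runs for you, and what `ORTModelForSeq2SeqLM` reimplements on top of the two ONNX sessions.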
So if you would like to use these exported ONNX models outside of Optimum, I simply recommend using the above command to export, and then handling the models yourself. But `ORTModelForSeq2SeqLM` is meant to save you that hassle.
If you want to try it right away, feel free to try the dev version: `pip install -U git+https://github.com/huggingface/optimum.git@main`
Edit 2022-12-27: Feel free to have a look at the latest release notes, which include the feature: Release v1.6.0: Optimum CLI, Stable Diffusion ONNX export, BetterTransformer & ONNX support for more architectures · huggingface/optimum · GitHub