When exporting seq2seq models with ONNX, why do we need both decoder_with_past_model.onnx and decoder_model.onnx?

Hi @Xenova , thank you for having a try at this!

Using only decoder_with_past_model.onnx, I have the same experience as you for gpt2 (although I had to add position_ids as an input to get matching logits, due to this logic).
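To illustrate, here is a minimal sketch of how those position_ids can be derived when feeding only the with-past decoder (plain Python; it assumes no padding, so positions simply continue from the KV-cache length — `make_position_ids` is a hypothetical helper, not an Optimum API):

```python
# Sketch: deriving position_ids when running only decoder_with_past_model.onnx.
# Assumption: no padding, so new positions continue from the cached length.
def make_position_ids(past_length: int, num_new_tokens: int) -> list:
    # Positions for the new tokens start right after the cached sequence.
    return list(range(past_length, past_length + num_new_tokens))

# First pass: a 4-token prompt with an empty cache.
print(make_position_ids(0, 4))  # [0, 1, 2, 3]

# Decoding step: one new token, with 4 positions already cached.
print(make_position_ids(4, 1))  # [4]
```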

Unfortunately, I have not had time to try encoder-decoder models yet, but I assume it is possible. I can take a look shortly.

Alternatively, are you able to support ONNX models that have subgraphs? That's the approach we are currently taking in Optimum; for reference: Validating ONNX model fails for GPT-J · Issue #607 · huggingface/optimum · GitHub. This is only available for decoder-only models for now, though I plan to extend it to encoder-decoder architectures.

If you export a decoder-only model, you'll see as output a merged decoder that handles both the without-past and with-past cases: `optimum-cli export onnx --model gpt2 gpt2_onnx/`