Hi @Xenova, thank you for giving this a try!
I had the same experience as you with gpt2 using only decoder_with_past_model.onnx (although I had to pass position_ids as an input to get matching logits, due to this logic).
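In case it helps, here is a minimal sketch of how I derive position_ids from the attention mask (the helper name and past_length handling are my own, not anything from the exported model itself):

```python
import numpy as np

def build_position_ids(attention_mask: np.ndarray, past_length: int = 0) -> np.ndarray:
    # Cumulative sum over the mask gives 0-based positions for real tokens;
    # padding positions would go negative, so clamp them to 0 to keep valid indices.
    position_ids = attention_mask.cumsum(axis=-1) - 1
    position_ids = np.clip(position_ids, 0, None)
    if past_length > 0:
        # With past key/values only the new token is fed, so keep just its position
        # (the mask here covers past + new tokens).
        position_ids = position_ids[:, -1:]
    return position_ids

mask = np.array([[1, 1, 1, 1]])
print(build_position_ids(mask))                 # [[0 1 2 3]]
print(build_position_ids(mask, past_length=3))  # [[3]]
```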
Unfortunately I did not have time to try encoder-decoder models; I was assuming it was possible. I can have a look shortly.
Alternatively, are you able to support ONNX models that have subgraphs? That’s the approach we are currently taking in Optimum, for reference: Validating ONNX model fails for GPT-J · Issue #607 · huggingface/optimum · GitHub. This is only available for decoder-only models for now, though; I plan to extend it to encoder-decoder architectures.
If you export a decoder-only model, you’ll see as output a merged decoder that handles both cases, without past and with past: `optimum-cli export onnx gpt2 gpt2_onnx/`