RuntimeError when loading a custom Whisper model exported to ONNX

I’ve successfully exported a custom Whisper model to ONNX using the example code from the Custom Export of Transformers Models documentation. The export completed without errors, and the model was saved to the custom_whisper_onnx directory.
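
For context, here is a condensed sketch of the export step, closely following the documentation example (the base checkpoint openai/whisper-tiny.en and the simplified config setup are assumptions; my actual script only differs in the custom OnnxConfig tweaks described in the docs):

from transformers import AutoConfig
from optimum.exporters.onnx import main_export
from optimum.exporters.onnx.model_configs import WhisperOnnxConfig

model_id = "openai/whisper-tiny.en"  # assumed base checkpoint
config = AutoConfig.from_pretrained(model_id)

# One ONNX config per sub-model, as in the documentation example
whisper_onnx_config = WhisperOnnxConfig(config=config, task="automatic-speech-recognition")
custom_onnx_configs = {
    "encoder_model": whisper_onnx_config.with_behavior("encoder"),
    "decoder_model": whisper_onnx_config.with_behavior("decoder", use_past=False),
    "decoder_with_past_model": whisper_onnx_config.with_behavior("decoder", use_past=True),
}

main_export(
    model_id,
    output="custom_whisper_onnx",   # the directory I later try to load from
    no_post_process=True,           # as in the docs example; skips post-processing such as decoder merging
    custom_onnx_configs=custom_onnx_configs,
)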

However, when I try to load the exported model with the following code:

from transformers import AutoProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("custom_whisper_onnx")
model = ORTModelForSpeechSeq2Seq.from_pretrained("custom_whisper_onnx")

I encounter a RuntimeError:

RuntimeError: Could not find the past key values in the provided model.

The traceback points to the initialization of the ORTModelForSpeechSeq2Seq class.

By contrast, when I load a model pre-exported by Hugging Face, such as optimum/whisper-tiny.en, it works without any issues:

processor = AutoProcessor.from_pretrained("optimum/whisper-tiny.en")
model = ORTModelForSpeechSeq2Seq.from_pretrained("optimum/whisper-tiny.en")
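
In case the file layout is relevant, here is a small sketch for comparing what my export produced against the files in the working repo (list_repo_files is from huggingface_hub; I haven’t yet confirmed whether the set of .onnx files actually differs):

import os
from huggingface_hub import list_repo_files

# Files produced by my custom export (local directory)
print(sorted(os.listdir("custom_whisper_onnx")))

# ONNX files in the pre-exported repo that loads fine
print(sorted(f for f in list_repo_files("optimum/whisper-tiny.en") if f.endswith(".onnx")))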