which was saved as ./oyto_t5_small_onnx. See the picture below for the output of the conversion. How can I use these ONNX models? I understand why I have encoder_model.onnx and decoder_model.onnx, since I'm using a pre-trained T5 model, which has an encoder-decoder architecture. My pain point is that I cannot figure out how to run inference with these converted ONNX models.
You can run inference with the ORTModelForSeq2SeqLM class (ORT is short for ONNX Runtime). Just pass your model folder to its from_pretrained method. Docs: Models