which was saved as ./oyto_t5_small_onnx. See the picture below for the output of the conversion. How can I use these ONNX models? I understand why I have encoder_model.onnx and decoder_model.onnx, since I'm using a pre-trained T5 model, which has an encoder-decoder architecture. My pain point is that I cannot figure out how to run inference with these converted ONNX models.
You can run inference with the ORTModelForSeq2SeqLM class (ORT is short for ONNX Runtime). Just pass your model folder to its from_pretrained method. Docs: Models