I am trying to optimize a Seq2Seq model for summarization. This guide has been very useful for quantizing it, but the results from the quantized model aren't accurate, so I want to optimize the model and compare the results. I know that I will have to optimize each part separately (the encoder, the decoder, and the decoder_with_past), but I don't know how.
This is the code I'm trying for optimizing each part, but it isn't working:
import re
import torch
from transformers import AutoConfig, AutoModelForSeq2SeqLM, AutoTokenizer
import time
from optimum.onnxruntime import ORTQuantizer, ORTModelForSeq2SeqLM, ORTOptimizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
from pathlib import Path
# load Seq2Seq model and set model file directory
model_id = "facebook/bart-large-cnn"
optimization_config = OptimizationConfig(optimization_level=99)
onnx_path = Path("/")  # must be a Path (not a str) for the "/" joins below to work
# Create encoder optimizer
encoder_optimizer = ORTOptimizer.from_pretrained(model_name_or_path="/testing/encoder_model.onnx", feature='seq2seq-lm')
encoder_optimizer.export(
    optimization_config=optimization_config,
    onnx_model_path=onnx_path / "encoder.onnx",
    onnx_optimized_model_output_path=onnx_path / "encoder_optimized.onnx",
)
# Create decoder optimizer
decoder_optimizer = ORTOptimizer.from_pretrained(model_name_or_path="/testing/decoder_model.onnx", feature='seq2seq-lm')
decoder_optimizer.export(
    optimization_config=optimization_config,
    onnx_model_path=onnx_path / "decoder.onnx",
    onnx_optimized_model_output_path=onnx_path / "decoder_optimized.onnx",
)
# Create decoder with past key values optimizer
decoder_wp_optimizer = ORTOptimizer.from_pretrained(model_name_or_path="/testing/decoder_with_past_model.onnx", feature='seq2seq-lm')
decoder_wp_optimizer.export(
    optimization_config=optimization_config,
    onnx_model_path=onnx_path / "decoder_wp.onnx",
    onnx_optimized_model_output_path=onnx_path / "decoder_wp_optimized.onnx",
)
OUTPUT:
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
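One plausible reading of this error (an assumption, not confirmed by the traceback) is that from_pretrained could not resolve the given string as a local model directory and fell back to a download. A minimal, standard-library-only sketch for checking what each string actually points to is shown here; describe_model_source is a hypothetical helper, not part of Optimum:

```python
from pathlib import Path


def describe_model_source(source: str) -> str:
    """Classify a string the way a local-first loader might see it."""
    path = Path(source)
    if path.is_dir():
        return "local directory"
    if path.is_file():
        return "local file"
    return "not found locally"


if __name__ == "__main__":
    # Paths taken from the snippet above.
    names = ("encoder_model.onnx", "decoder_model.onnx", "decoder_with_past_model.onnx")
    for name in names:
        candidate = f"/testing/{name}"
        print(candidate, "->", describe_model_source(candidate))
```

If a path is reported as "local file" or "not found locally", the loader is probably not receiving the model directory or model id it expects, which would be consistent with the download attempt.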
Any help would be greatly appreciated.
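For comparison, here is a hedged sketch of how the whole Seq2Seq model might be optimized in one pass. It assumes a newer Optimum release than the one used above, where ORTOptimizer.from_pretrained accepts a loaded ORTModelForSeq2SeqLM and exposes optimize(save_dir=..., optimization_config=...) instead of export(...). The function name optimize_seq2seq and the output directory are hypothetical, and this is not verified against the version that produced the error:

```python
from pathlib import Path


def optimize_seq2seq(model_id: str, save_dir: str, optimization_level: int = 99) -> Path:
    """Export a Seq2Seq checkpoint to ONNX and optimize the exported parts.

    Assumes a recent Optimum release. Imports are local so this file can be
    loaded even where Optimum is not installed.
    """
    from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTOptimizer
    from optimum.onnxruntime.configuration import OptimizationConfig

    # Export the PyTorch checkpoint to ONNX.
    # Assumption: older releases spell this flag from_transformers=True.
    model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)

    # Build the optimizer from the loaded ORT model, not from an .onnx file path.
    optimizer = ORTOptimizer.from_pretrained(model)
    config = OptimizationConfig(optimization_level=optimization_level)

    # Write the optimized ONNX files into save_dir.
    optimizer.optimize(save_dir=save_dir, optimization_config=config)
    return Path(save_dir)
```

A call such as optimize_seq2seq("facebook/bart-large-cnn", "bart_optimized") would download the checkpoint on first use, so it needs network access and enough disk space for the exported files.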