Improving decoding speed by converting a model to ONNX

outstanding · November 17, 2021, 7:52am

I trained a model with UER-py and converted it to the transformers format, but decoding is not very fast. I want to speed it up by converting the model to ONNX.

python -m transformers.onnx --model=bert-base-cased onnx/bert-base-cased/

I ran into problems when using the converted model.
According to the official documentation, the exported ONNX model has no generation method.
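
Since there is no generation method, one workaround is to write the decoding loop by hand. Below is a minimal sketch of greedy decoding, not the library's own implementation. `greedy_decode`, `next_token_logits`, and `toy_logits` are hypothetical names introduced for illustration. The sketch assumes the exported model returns language-modeling logits (so the export would need to include an LM head), and that the real `next_token_logits` wraps an ONNX Runtime session; the input names mentioned in the comments are assumptions that depend on how the model was exported.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: List[int],
    eos_id: int,
    max_new_tokens: int = 20,
) -> List[int]:
    """Greedy decoding around any function that maps token ids to next-token logits.

    In real use, `next_token_logits` would call something like
    `session.run(None, {"input_ids": ..., "attention_mask": ...})` on an
    onnxruntime.InferenceSession and return the logits of the last position.
    Those input names are an assumption; check the exported model's inputs.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)
        # Pick the highest-scoring token (argmax) as the next token.
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


def toy_logits(ids: List[int]) -> List[float]:
    """Toy stand-in for the model: always favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


result = greedy_decode(toy_logits, [0], eos_id=4)
print(result)
```

This version re-runs the model on the full sequence at every step, which is simple but slow; feeding past state back in at each step avoids that recomputation.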

I looked at the example notebook for converting GPT-2 to ONNX. With it, the GPT-2 model decodes correctly and the speed improves.

github.com

microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/notebooks/Inference_GPT2_with_OnnxRuntime_on_CPU.ipynb

Inference PyTorch GPT2 Model with ONNX Runtime on CPU

In this tutorial, you'll be introduced to how to load a GPT2 model from PyTorch, convert it to ONNX, and inference it using ONNX Runtime using IO Binding. Note that past state is used to get better performance.

However, when I replace the GPT-2 model in that notebook with my converted Chinese model, an exception is raised.

Hope to get help here, thank you!
