Hello!
I want to speed up generation with my Chinese GPT-2 model, so I converted it to ONNX as described in the documentation.
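For reference, the export command I ran looked roughly like this (the checkpoint name uer/gpt2-chinese-cluecorpussmall is just an example of a Chinese GPT-2 model, and I passed --feature=causal-lm because I want text generation):

python -m transformers.onnx --model=uer/gpt2-chinese-cluecorpussmall --feature=causal-lm onnx/gpt2-chinese/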
1. Testing with the code from the documentation:
import onnxruntime as ort
from transformers import BertTokenizerFast

# Load the tokenizer and the exported ONNX model (BERT example from the docs)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
ort_session = ort.InferenceSession("onnx/bert-base-cased/model.onnx")

# Tokenize to NumPy arrays and run the ONNX session
inputs = tokenizer("Using BERT in ONNX!", return_tensors="np")
outputs = ort_session.run(["last_hidden_state", "pooler_output"], dict(inputs))
How should I decode these outputs?
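I realize that last_hidden_state and pooler_output in the BERT example are embeddings rather than token IDs. For my GPT-2 causal-LM export, I guessed decoding would be a greedy loop over the logits, something like the sketch below, but I'm not sure it is right (the checkpoint name, the model path, and the "logits" output name are just my assumptions):

import numpy as np
import onnxruntime as ort
from transformers import BertTokenizerFast

# Assumption: Chinese GPT-2 checkpoints such as uer/gpt2-chinese-cluecorpussmall
# use a BERT-style tokenizer, and the causal-lm ONNX export returns "logits".
tokenizer = BertTokenizerFast.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
session = ort.InferenceSession("onnx/gpt2-chinese/model.onnx")

enc = tokenizer("今天天气", return_tensors="np")
inputs = {
    "input_ids": enc["input_ids"].astype(np.int64),
    "attention_mask": enc["attention_mask"].astype(np.int64),
}

# Greedy decoding: feed the growing sequence back in and take the argmax
# of the logits at the last position on each step.
for _ in range(20):
    logits = session.run(["logits"], inputs)[0]  # shape (batch, seq_len, vocab)
    next_id = int(logits[0, -1].argmax())
    inputs["input_ids"] = np.concatenate(
        [inputs["input_ids"], np.array([[next_id]], dtype=np.int64)], axis=1
    )
    inputs["attention_mask"] = np.concatenate(
        [inputs["attention_mask"], np.ones((1, 1), dtype=np.int64)], axis=1
    )

print(tokenizer.decode(inputs["input_ids"][0], skip_special_tokens=True))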
Even after reading this example, I still don't really understand. Please forgive me, I'm a complete beginner.
I hope someone can help!