Accelerated gpt2-chinese-cluecorpussmall model

Hello!
I want to speed up generation with the GPT-2 Chinese model (gpt2-chinese-cluecorpussmall).
I converted it to ONNX as described in the documentation.

1. I tested the example code from the documentation:

import onnxruntime as ort
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
ort_session = ort.InferenceSession("onnx/bert-base-cased/model.onnx")

inputs = tokenizer("Using BERT in ONNX!", return_tensors="np")
outputs = ort_session.run(["last_hidden_state", "pooler_output"], dict(inputs))

How should I decode the outputs? This example returns last_hidden_state and pooler_output, but for text generation I need token ids that I can turn back into text.
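What I imagine is needed is a greedy decoding loop over the model's logits. Below is a minimal sketch of that idea; the run_model callback signature, the assumed logits layout of [batch, seq_len, vocab_size], and the toy_model stand-in are all assumptions for illustration, not the real exported model's interface:

```python
import numpy as np

def greedy_decode(run_model, input_ids, max_new_tokens=10, eos_id=None):
    """Append argmax tokens one at a time; returns the full id sequence.

    run_model takes an int64 array of shape [1, seq_len] and returns
    logits of shape [1, seq_len, vocab_size] (assumed layout).
    """
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = run_model(np.array([ids], dtype=np.int64))
        next_id = int(np.argmax(logits[0, -1]))  # greedy pick at last position
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids

# Toy stand-in for the ONNX session: always prefers (last_id + 1) % vocab,
# so the loop's behavior is easy to check without a real model file.
def toy_model(ids_batch):
    vocab = 5
    last = int(ids_batch[0, -1])
    logits = np.zeros((1, ids_batch.shape[1], vocab))
    logits[0, -1, (last + 1) % vocab] = 1.0
    return logits

print(greedy_decode(toy_model, [0], max_new_tokens=4))  # → [0, 1, 2, 3, 4]
```

With a real session, run_model would wrap something like ort_session.run(...) on the exported GPT-2 graph (the exact input/output names depend on how the model was exported), and the returned ids would be turned back into text with tokenizer.decode(ids, skip_special_tokens=True).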

After reading this example, I still don't understand. Please forgive me, I'm a beginner.

Hope to get help!