Hello!
I want to speed up generation with my Chinese GPT-2 model, so I converted it to ONNX as described in the documentation.
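For reference, the export command I ran looked roughly like this (the checkpoint name uer/gpt2-chinese-cluecorpussmall is just an example of a Chinese GPT-2 model, and I passed --feature=causal-lm because I want text generation):

python -m transformers.onnx --model=uer/gpt2-chinese-cluecorpussmall --feature=causal-lm onnx/gpt2-chinese/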
1. Testing with the code from the documentation:
import onnxruntime as ort
from transformers import BertTokenizerFast

# Load the tokenizer and the exported ONNX model (BERT example from the docs)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
ort_session = ort.InferenceSession("onnx/bert-base-cased/model.onnx")

# Tokenize to NumPy arrays and run the ONNX session
inputs = tokenizer("Using BERT in ONNX!", return_tensors="np")
outputs = ort_session.run(["last_hidden_state", "pooler_output"], dict(inputs))
How should I decode these outputs?
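I realize that last_hidden_state and pooler_output in the BERT example are embeddings rather than token IDs. For my GPT-2 causal-LM export, I guessed decoding would be a greedy loop over the logits, something like the sketch below, but I'm not sure it is right (the checkpoint name, the model path, and the "logits" output name are just my assumptions):

import numpy as np
import onnxruntime as ort
from transformers import BertTokenizerFast

# Assumption: Chinese GPT-2 checkpoints such as uer/gpt2-chinese-cluecorpussmall
# use a BERT-style tokenizer, and the causal-lm ONNX export returns "logits".
tokenizer = BertTokenizerFast.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
session = ort.InferenceSession("onnx/gpt2-chinese/model.onnx")

enc = tokenizer("今天天气", return_tensors="np")
inputs = {
    "input_ids": enc["input_ids"].astype(np.int64),
    "attention_mask": enc["attention_mask"].astype(np.int64),
}

# Greedy decoding: feed the growing sequence back in and take the argmax
# of the logits at the last position on each step.
for _ in range(20):
    logits = session.run(["logits"], inputs)[0]  # shape (batch, seq_len, vocab)
    next_id = int(logits[0, -1].argmax())
    inputs["input_ids"] = np.concatenate(
        [inputs["input_ids"], np.array([[next_id]], dtype=np.int64)], axis=1
    )
    inputs["attention_mask"] = np.concatenate(
        [inputs["attention_mask"], np.ones((1, 1), dtype=np.int64)], axis=1
    )

print(tokenizer.decode(inputs["input_ids"][0], skip_special_tokens=True))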
Even after reading this example, I still don't really understand. Please forgive me, I'm a complete beginner.
I hope someone can help!