Hey @heinz, there’s a notebook you can use to get started: transformers/04-onnx-export.ipynb in the huggingface/transformers repo on GitHub.
The main thing you need to do is create an ORT InferenceSession, e.g. with a function like the following:
from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions, get_all_providers

def create_model_for_provider(model_path: str, provider: str) -> InferenceSession:
    assert provider in get_all_providers(), f"provider {provider} not found, {get_all_providers()}"

    # A few session options that can affect performance (suggested by Microsoft)
    options = SessionOptions()
    options.intra_op_num_threads = 1
    options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

    # Load the model as a graph and prepare the requested backend
    session = InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()

    return session
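For example, to get a CPU-backed session (the ONNX path here is just a placeholder for wherever you exported your model):

cpu_session = create_model_for_provider("onnx/bert-base-uncased.onnx", "CPUExecutionProvider")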
Once you create a session, you’ll still need to tokenize and encode the inputs; there’s a rough sketch of that below. You can also find some additional examples in the ORT repo, e.g. onnxruntime/PyTorch_Bert-Squad_OnnxRuntime_CPU.ipynb in the microsoft/onnxruntime repo on GitHub.
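Here’s a minimal sketch of that part, reusing the cpu_session from above and assuming a BERT-style export whose graph input names match the tokenizer’s output keys (input_ids, attention_mask, token_type_ids); adjust the checkpoint name to your own model:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ORT expects a dict of NumPy arrays keyed by the graph's input names
encoded = tokenizer("Hello, ONNX Runtime!", return_tensors="np")
outputs = cpu_session.run(None, dict(encoded))  # None -> return all outputs

print(outputs[0].shape)  # e.g. (1, seq_len, hidden_size) for the last hidden state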