Using distributed or parallel set-up in script?: No
Information
Model I am using (Bert, XLNet …): Bert
The problem arises when using:
my own modified scripts: (give details below)
The task I am working on is:
my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
1. Trained a Hugging Face Transformers BertForSequenceClassification model on a custom dataset with the PyTorch backend.
2. Used the provided convert_graph_to_onnx.py script to convert the model (from a saved checkpoint) to ONNX format.
3. Loaded the model with ONNX Runtime.
4. Instantiated BertTokenizer.from_pretrained("bert-base-uncased") and fed various input texts to its encode_plus method.
5. Fed the outputs of this to the ONNX Runtime session.
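For reference, a minimal sketch of steps 3-5, assuming the exported graph was saved as model.onnx (a placeholder path):

import numpy as np
from onnxruntime import InferenceSession
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
session = InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# encode_plus with return_tensors="np" yields the int64 arrays the graph expects
encoded = tokenizer.encode_plus("some example text", return_tensors="np")
inputs = {name: tensor for name, tensor in encoded.items()}

# The exported classification graph has a single output
logits = session.run(None, inputs)[0]
print(logits.shape)  # (1, 100) for a 100-class head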
Expected behavior
The expected behavior is that the output of sess.run on the aforementioned inputs should be an array of shape (1, 100) (corresponding to 100 classes), with each value between 0 and 1 and all entries summing to 1. We get the correct shape, but the values range from about -3.04 to 7.14 (unsure what these values refer to).
Hi @nsingh, without seeing your code it's hard to know exactly what's going wrong, but based on this comment

We get the correct shape, but the values range from about -3.04 to 7.14 (unsure what these values refer to).

my guess is that you are getting the logits from the model instead of the predicted classes. I ran into this problem recently, and the solution was to specify pipeline_name="sentiment-analysis" so that the model is loaded for a TextClassificationPipeline.
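For example, the conversion call might look like the following (a sketch; the checkpoint path, output path, and opset here are placeholders):

from pathlib import Path
from transformers.convert_graph_to_onnx import convert

convert(
    framework="pt",
    model="path/to/fine-tuned-checkpoint",  # placeholder: directory of your saved checkpoint
    output=Path("onnx/model.onnx"),
    pipeline_name="sentiment-analysis",  # export the sequence-classification head, not feature extraction
    opset=12,
)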
Hi @lewtun, I'm new to ONNX and having difficulties moving my pipeline into ONNX Runtime. Currently my workflow is like this:

config = AutoConfig.from_pretrained(path_finetuned)
model = AutoModelForSequenceClassification.from_pretrained(path_finetuned, config=config)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")
classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)

I am able to convert the model to ONNX with:

convert(
    framework="pt",
    model="distilbert-base-cased",
    output=Path("/onnx/fine-tuned.onnx"),
    pipeline_name="sentiment-analysis",
    opset=13,
)
However, I'm having difficulty wrapping the ONNX graph in a similar pipeline, i.e. classifier = TextClassificationPipeline(...). Could you share your approach?
The main thing you need to do is create an ORT InferenceSession, e.g. with the following function:
from onnxruntime import (
    GraphOptimizationLevel,
    InferenceSession,
    SessionOptions,
    get_all_providers,
)

def create_model_for_provider(model_path: str, provider: str) -> InferenceSession:
    assert provider in get_all_providers(), f"provider {provider} not found, {get_all_providers()}"

    # A few session properties that can have an impact on performance (provided by MS)
    options = SessionOptions()
    options.intra_op_num_threads = 1
    options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

    # Load the model as a graph and prepare the chosen execution provider
    session = InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()
    return session
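Tokenization and post-processing then live outside the session. Here's a rough sketch of a pipeline-like wrapper, assuming the exported graph sits at onnx/fine-tuned.onnx and reusing path_finetuned from your snippet (both placeholders). Note that the graph returns raw logits, so a softmax is applied to get pipeline-style scores:

import numpy as np
from transformers import AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")
config = AutoConfig.from_pretrained(path_finetuned)  # id2label comes from your fine-tuned config
session = create_model_for_provider("onnx/fine-tuned.onnx", "CPUExecutionProvider")

def classify(text: str) -> dict:
    # DistilBERT takes input_ids and attention_mask (no token_type_ids)
    inputs = {name: tensor for name, tensor in tokenizer(text, return_tensors="np").items()}
    logits = session.run(None, inputs)[0]
    # Softmax over the class dimension turns logits into probabilities
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = exp / exp.sum(axis=-1, keepdims=True)
    idx = int(probs.argmax(axis=-1)[0])
    return {"label": config.id2label[idx], "score": float(probs[0, idx])}

print(classify("I love this movie!"))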