Different results from HF and ONNX

maylorian · March 6, 2023, 6:30pm

Hi there!

I converted a model to ONNX using python -m transformers.onnx --model=dslim/bert-large-NER onnx and loaded it in Java with the onnxruntime library, but it gives different results from the same model run with HF.

I’ve found this issue, "[Bug] Attention and QAttention don't work properly in some cases" (Issue #14363 in microsoft/onnxruntime on GitHub), and just wanted to check whether there’s a solution to this that I could use.

Thanks!
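
In case it helps anyone narrow this down, here is a minimal sketch of how the two sets of outputs could be compared once the logits from each runtime have been dumped to plain lists. It assumes the same input_ids and attention_mask were fed to both HF and ONNX Runtime. The function name compare_logits, the tolerance of 1e-4, and the toy numbers are all made up for illustration and are not taken from either library.

```python
from typing import Dict, List, Union


def compare_logits(
    hf_logits: List[List[float]],
    onnx_logits: List[List[float]],
    atol: float = 1e-4,  # assumed tolerance, tune for your model
) -> Dict[str, Union[float, bool, List[int]]]:
    """Compare per-token logits (tokens x labels) from two runtimes."""
    if len(hf_logits) != len(onnx_logits):
        raise ValueError("token counts differ; check tokenization and padding")

    max_abs_diff = 0.0
    mismatched_tokens: List[int] = []

    for i, (hf_row, onnx_row) in enumerate(zip(hf_logits, onnx_logits)):
        if len(hf_row) != len(onnx_row):
            raise ValueError(f"label counts differ at token {i}")

        # Largest element-wise gap seen so far.
        for a, b in zip(hf_row, onnx_row):
            max_abs_diff = max(max_abs_diff, abs(a - b))

        # Index of the highest-scoring label in each runtime.
        hf_label = max(range(len(hf_row)), key=hf_row.__getitem__)
        onnx_label = max(range(len(onnx_row)), key=onnx_row.__getitem__)
        if hf_label != onnx_label:
            mismatched_tokens.append(i)

    return {
        "max_abs_diff": max_abs_diff,
        "within_tolerance": max_abs_diff <= atol,
        "mismatched_tokens": mismatched_tokens,
    }


# Toy values only, not real model output.
hf = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3]]
onnx = [[0.1, 2.0, -1.0], [0.2, 1.5, 0.3]]
report = compare_logits(hf, onnx)
print(report)
```

A small max_abs_diff with no mismatched tokens would suggest ordinary floating-point noise, while large gaps or flipped labels would point to a real difference in inputs or in the exported graph.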
