Convert DeBERTa model to ONNX with mixed precision

Hello. I’m using deberta-v3-base for a text classification task. After training, I convert the PyTorch model to ONNX like this:

    from pathlib import Path

    from transformers.onnx import FeaturesManager, export

    # Look up the ONNX config class for this architecture/feature pair.
    model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(
        model, feature="sequence-classification"
    )
    onnx_config = model_onnx_config(config)

    # Export the fine-tuned model (fp32 weights) to ONNX.
    onnx_inputs, onnx_outputs = export(
        preprocessor=tokenizer,
        model=model,
        config=onnx_config,
        opset=15,
        output=Path(args.output_model) / "onnx_model.onnx",
    )
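
For context, I sanity-check the export with a quick ONNX Runtime run (a minimal sketch; the example sentence is a placeholder):

    import onnxruntime as ort

    session = ort.InferenceSession(
        str(Path(args.output_model) / "onnx_model.onnx"),
        providers=["CPUExecutionProvider"],
    )
    # Tokenize to NumPy arrays and drop any keys the graph doesn't declare
    # as inputs (e.g. token_type_ids, depending on the export).
    inputs = tokenizer("An example sentence.", return_tensors="np")
    input_names = {i.name for i in session.get_inputs()}
    feed = {k: v for k, v in dict(inputs).items() if k in input_names}
    logits = session.run(None, feed)[0]  # shape: (batch_size, num_labels)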

Everything works like a charm, except that the exported model is roughly twice the size of the original DeBERTa checkpoint (~750 MB), presumably because the export stores every weight in fp32. Because of this I want to convert the model with mixed precision, i.e. fp16. I tried two approaches:

  1. Run model.half() before the ONNX export (rough sketch after the code below)
  2. Use the following code:
    from onnxruntime.transformers import optimizer

    # Fuse the exported graph into BERT-style ops, then cast weights to fp16.
    optimized_model = optimizer.optimize_model(
        "onnx_model.onnx", model_type="bert",
        num_heads=12, hidden_size=768, use_gpu=False, opt_level=0,
    )
    optimized_model.convert_float_to_float16()
    optimized_model.save_model_to_file("onnx_model_fp16.onnx")
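
For completeness, approach 1 was just the export call from above with the model cast to half precision first. A rough sketch (the fp16 output file name here is arbitrary):

    # Approach 1: cast the fine-tuned model to fp16 before exporting.
    model = model.half().eval()
    onnx_inputs, onnx_outputs = export(
        preprocessor=tokenizer,
        model=model,
        config=onnx_config,
        opset=15,
        output=Path(args.output_model) / "onnx_model_fp16.onnx",
    )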

But in both cases I get this warning, repeated, during inference on CPU:

    2023-01-06 10:46:46.332352649 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold LayerNormalization node 'LayerNorm_1'
    2023-01-06 10:46:46.414666254 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold LayerNormalization node 'LayerNorm_1'
    2023-01-06 10:46:46.425605272 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold LayerNormalization node 'LayerNorm_1'
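
If I read the warning correctly, ONNX Runtime's CPU execution provider has no fp16 kernel for LayerNormalization. Is keeping that op (and the graph inputs/outputs) in fp32 the intended workaround? From what I can tell, convert_float_to_float16 forwards keep_io_types and op_block_list to the underlying fp16 converter, so something like this should be possible (untested on my side):

    # Untested idea: keep LayerNormalization and the graph I/O in fp32 so the
    # CPU provider still has kernels for them; everything else becomes fp16.
    optimized_model.convert_float_to_float16(
        keep_io_types=True,
        op_block_list=["LayerNormalization"],
    )
    optimized_model.save_model_to_file("onnx_model_mixed.onnx")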

I also tried setting use_gpu=True in the optimize_model call. The warnings disappeared, but inference was 3-4 times slower.
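
For reference, I measure latency with simple wall-clock timing (a minimal sketch; the file name and example sentence are placeholders, and tokenizer is the one from the export above):

    import time

    import onnxruntime as ort

    session = ort.InferenceSession("onnx_model_fp16.onnx", providers=["CPUExecutionProvider"])
    inputs = tokenizer("An example sentence.", return_tensors="np")
    # Drop any tokenizer outputs the graph doesn't declare as inputs.
    feed = {k: v for k, v in dict(inputs).items() if k in {i.name for i in session.get_inputs()}}

    session.run(None, feed)  # warm-up
    start = time.perf_counter()
    for _ in range(100):
        session.run(None, feed)
    print((time.perf_counter() - start) / 100, "seconds per run")

Is there a way to get a smaller fp16 (or mixed-precision) model that still runs fast on CPU?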