Getting ValueError when exporting model to ONNX using optimum

I’m trying to export my fine-tuned BERT classifier (PyTorch) model to ONNX format using Optimum, and I eventually want to run it through a pipeline for a sequence classification task.

Here is the code:

from optimum.onnxruntime import ORTModelForSequenceClassification, ORTOptimizer
from optimum.onnxruntime.configuration import OptimizationConfig
from optimum.pipelines import pipeline

# path_to_fine_tuned_model contains the path to the folder containing the pytorch_model.bin file
optimizer = ORTOptimizer.from_pretrained(path_to_fine_tuned_model, feature="sequence-classification") 
optimization_config = OptimizationConfig(optimization_level=2)

optimizer.export(
    onnx_model_path='../models/bert_model_opt.onnx',
    onnx_optimized_model_output_path='../models/bert_model_optimized.onnx',
    optimization_config=optimization_config,
)
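
For reference, the downstream usage I eventually want (which is why pipeline and ORTModelForSequenceClassification are imported above) looks roughly like this. This is only a sketch; the paths, file name, and checkpoint are placeholders:

from transformers import AutoTokenizer

# Load the tokenizer that was used for fine-tuning (placeholder path)
tokenizer = AutoTokenizer.from_pretrained(path_to_fine_tuned_model)

# The directory passed here should also contain the model's config.json;
# file_name selects the specific ONNX file to load (if your Optimum version supports it)
ort_model = ORTModelForSequenceClassification.from_pretrained(
    '../models', file_name='bert_model_optimized.onnx'
)

classifier = pipeline('text-classification', model=ort_model, tokenizer=tokenizer)
print(classifier('This movie was great!'))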

However, the export call throws ValueError: Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 optimizer.export(
      2     onnx_model_path='../models/bert_model_opt.onnx',
      3     onnx_optimized_model_output_path='../models/bert_model_optimized.onnx',
      4     optimization_config=optimization_config,
      5 )

File /opt/conda/envs/conda_ml/lib/python3.9/site-packages/optimum/onnxruntime/optimization.py:123, in ORTOptimizer.export(self, onnx_model_path, onnx_optimized_model_output_path, optimization_config, use_external_data_format)
    121 # Export the model if it has not already been exported to ONNX IR
    122 if not onnx_model_path.exists():
--> 123     export(self.preprocessor, self.model, self._onnx_config, self.opset, onnx_model_path)
    125 ORTConfigManager.check_supported_model_or_raise(self._model_type)
    126 num_heads = getattr(self.model.config, ORTConfigManager.get_num_heads_name(self._model_type))

File /opt/conda/envs/conda_ml/lib/python3.9/site-packages/transformers/onnx/convert.py:336, in export(preprocessor, model, config, opset, output, tokenizer, device)
    330         logger.warning(
    331             f"Unsupported PyTorch version for this model. Minimum required is {config.torch_onnx_minimum_version},"
    332             f" got: {torch_version}"
    333         )
    335 if is_torch_available() and issubclass(type(model), PreTrainedModel):
--> 336     return export_pytorch(preprocessor, model, config, opset, output, tokenizer=tokenizer, device=device)
    337 elif is_tf_available() and issubclass(type(model), TFPreTrainedModel):
    338     return export_tensorflow(preprocessor, model, config, opset, output, tokenizer=tokenizer)

File /opt/conda/envs/conda_ml/lib/python3.9/site-packages/transformers/onnx/convert.py:143, in export_pytorch(preprocessor, model, config, opset, output, tokenizer, device)
    139         setattr(model.config, override_config_key, override_config_value)
    141 # Ensure inputs match
    142 # TODO: Check when exporting QA we provide "is_pair=True"
--> 143 model_inputs = config.generate_dummy_inputs(preprocessor, framework=TensorType.PYTORCH)
    144 device = torch.device(device)
    145 if device.type == "cuda" and torch.cuda.is_available():

File /opt/conda/envs/conda_ml/lib/python3.9/site-packages/transformers/onnx/config.py:347, in OnnxConfig.generate_dummy_inputs(self, preprocessor, batch_size, seq_length, num_choices, is_pair, framework, num_channels, image_width, image_height, tokenizer)
    345     return dict(preprocessor(images=dummy_input, return_tensors=framework))
    346 else:
--> 347     raise ValueError(
    348         "Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor."
    349     )

ValueError: Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor.

Any idea how to fix this? Thanks!

Hi @AmoghM! Thanks for using Optimum.

According to the traceback you provided, no tokenizer or preprocessor was found. In the folder containing your pytorch_model.bin file, is there a JSON file that defines your tokenizer/preprocessor?
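
If it is missing, saving the tokenizer you used for fine-tuning into that folder should be enough for the exporter to generate dummy inputs. A minimal sketch (the checkpoint name is just an example; use whatever checkpoint your model was fine-tuned from):

from transformers import AutoTokenizer

# Load the tokenizer your model was fine-tuned with (checkpoint name is an example)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Write tokenizer.json / tokenizer_config.json / vocab files next to pytorch_model.bin
tokenizer.save_pretrained(path_to_fine_tuned_model)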


Thanks! The tokenizer files weren’t there. After adding them, the export no longer raises the ValueError, but now it throws this error message:

2022-08-17 21:30:22.151083668 [W:onnxruntime:, inference_session.cc:1546 Initialize] Serializing optimized model with Graph Optimization level greater than ORT_ENABLE_EXTENDED and the NchwcTransformer enabled. The generated model may contain hardware specific optimizations, and should only be used in the same environment the model was optimized in.
symbolic shape infer failed. it's safe to ignore this message if there is no issue with optimized model
symbolic shape infer failed. it's safe to ignore this message if there is no issue with optimized model
failed in shape inference <class 'AssertionError'>
failed in shape inference <class 'AssertionError'>

What is even stranger is that, even after the AssertionError, the models were still exported. However, the inference latency for the initial ONNX model, the optimized ONNX model, and the quantized ONNX model is about the same; I was expecting it to decrease.

Which versions of Optimum, ONNX, and ONNX Runtime are you using?

Here they are:

onnxruntime-gpu==1.12.1
optimum==1.3.0
onnx==1.12.0