Hi,
I'm trying to export the whisper-large model to ONNX, but I ran into the error below:
Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_8599 failed.tensorprotoutils.cc:637 TensorProtoToTensor External initializer: onnx::MatMul_8599 offset: 0 size to read: 26214400 given file_length: 6553600 are out of bounds or can not be read in full.
Before hitting the error, I also got these warning messages:
TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
How can I convert the openai/whisper-large model to ONNX format?
Hi @serdarcaglar, thank you for the report! Could you provide a reproducible command or code snippet to make it easier for us to track down the issue?
The ONNX export through transformers.onnx will soon rely fully on Optimum Exporters (the package for all things export). Currently, using the stable optimum==1.5.1, the following export command works well:
python -m optimum.exporters.onnx --model openai/whisper-tiny whisper_tiny_onnx_vanilla
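As a quick sanity check, you can load the exported file back with ONNX Runtime. This is only a sketch, and it assumes the vanilla export writes a single model.onnx into the output folder (the file name is an assumption):
import onnxruntime as ort

# Load the exported graph and print its input names
# ("whisper_tiny_onnx_vanilla/model.onnx" is an assumed path)
session = ort.InferenceSession("whisper_tiny_onnx_vanilla/model.onnx")
print([inp.name for inp in session.get_inputs()])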
In the next release of Optimum (which you can hopefully expect sometime next week), the exporter will support exporting the encoder and decoder as two separate files, making the model easier to use with ONNX Runtime:
python -m optimum.exporters.onnx --model openai/whisper-tiny --for-ort whisper_tiny_onnx
This will allow you to export your model and load it directly from a local folder into ORTModelForSpeechSeq2Seq.
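For reference, loading that export back might look like the following. A minimal sketch, assuming the upcoming release keeps the current from_pretrained API and that whisper_tiny_onnx is the output folder from the command above:
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

# Load the encoder/decoder ONNX files produced by --for-ort from a local folder
model = ORTModelForSpeechSeq2Seq.from_pretrained("whisper_tiny_onnx")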
The code I used for exporting:
from datasets import load_dataset
from transformers import AutoProcessor, pipeline
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("openai/whisper-large")
# from_transformers=True exports the PyTorch checkpoint to ONNX on the fly
model = ORTModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large", from_transformers=True)
speech_recognition_pipeline = pipeline(
    "automatic-speech-recognition",
    model=model,
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
)
Warning Messages:
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:200: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:239: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:750: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_shape[-1] > 1:
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:74: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
mask = torch.full((tgt_len, tgt_len), torch.tensor(torch.finfo(dtype).min))
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:207: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/home/joseph/miniconda3/envs/ort-deploy/lib/python3.9/site-packages/transformers/models/whisper/modeling_whisper.py:79: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if past_key_values_length > 0:
ERROR:
Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_9737 failed.tensorprotoutils.cc:637 TensorProtoToTensor External initializer: onnx::MatMul_9737 offset: 0 size to read: 26214400 given file_length: 6553600 are out of bounds or can not be read in full.
Hi @serdarcaglar, this is currently an issue with models saved in the external data format. We have a PR [255] open for it, and the fix should be available soon. In the meantime, you can run the above model by disabling the cache:
from datasets import load_dataset
from transformers import AutoProcessor, pipeline
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("openai/whisper-large")
# use_cache=False skips the decoder-with-past export, working around the external-data issue
model = ORTModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large", from_transformers=True, use_cache=False)
speech_recognition_pipeline = pipeline(
    "automatic-speech-recognition",
    model=model,
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
)
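If it helps, here is a minimal usage sketch for the resulting pipeline; the dummy LibriSpeech dataset below is just an illustrative choice, not part of the fix:
# Transcribe one short 16 kHz sample with the ONNX-backed pipeline
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
result = speech_recognition_pipeline(ds[0]["audio"]["array"])
print(result["text"])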