Hi all, I'm trying to convert the model nguyenvulebinh/wav2vec2-base-vietnamese-250h (a Wav2Vec2 speech-to-text model) to ONNX.
from pathlib import Path
import transformers
from transformers.onnx import FeaturesManager
from transformers import AutoConfig, AutoTokenizer, AutoModelForAudioClassification

# load model and tokenizer
model_id = "nguyenvulebinh/wav2vec2-base-vietnamese-250h"
feature = "audio-classification"
model = AutoModelForAudioClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# load config
model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
onnx_config = model_onnx_config(model.config)

# export
onnx_inputs, onnx_outputs = transformers.onnx.export(
    preprocessor=tokenizer,
    model=model,
    config=onnx_config,
    opset=13,
    output=Path("model.onnx"),
)
and I got this error:
Exception has occurred: KeyError
"wav2vec2 is not supported yet. Only ['albert', 'bart', 'beit', 'bert', 'big-bird', 'bigbird-pegasus', 'blenderbot', 'blenderbot-small', 'bloom', 'camembert', 'clip', 'codegen', 'convbert', 'convnext', 'data2vec-text', 'data2vec-vision', 'deberta', 'deberta-v2', 'deit', 'detr', 'distilbert', 'electra', 'flaubert', 'gpt2', 'gptj', 'gpt-neo', 'groupvit', 'ibert', 'imagegpt', 'layoutlm', 'layoutlmv3', 'levit', 'longt5', 'longformer', 'marian', 'mbart', 'mobilebert', 'mobilenet-v1', 'mobilenet-v2', 'mobilevit', 'mt5', 'm2m-100', 'owlvit', 'perceiver', 'poolformer', 'rembert', 'resnet', 'roberta', 'roformer', 'segformer', 'squeezebert', 'swin', 't5', 'vision-encoder-decoder', 'vit', 'whisper', 'xlm', 'xlm-roberta', 'yolos'] are supported. If you want to support wav2vec2 please propose a PR or open up an issue."
Seems like it's not yet supported. Is there a way to request it?
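In the meantime, I was thinking of trying optimum's Python API instead of transformers.onnx. This is just a rough sketch of what I had in mind, assuming ORTModelForCTC accepts this checkpoint and that my optimum version supports export=True; I haven't verified it:

# Untested sketch: export with optimum's ONNX Runtime wrapper instead of transformers.onnx.
# Assumes ORTModelForCTC accepts this wav2vec2 CTC checkpoint and that export=True is
# available in this optimum version (older versions used from_transformers=True).
# The output directory name is my own choice.
from optimum.onnxruntime import ORTModelForCTC

model_id = "nguyenvulebinh/wav2vec2-base-vietnamese-250h"

# convert the PyTorch weights to ONNX on the fly, then save the result
ort_model = ORTModelForCTC.from_pretrained(model_id, export=True)
ort_model.save_pretrained("onnxPython")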
I also tried to generate the ONNX model using the optimum CLI:
optimum-cli export onnx --model nguyenvulebinh/wav2vec2-base-vietnamese-250h onnxOptimum/
Framework not specified. Using pt to export to ONNX.
/home/ace/.local/lib/python3.10/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
Automatic task detection to automatic-speech-recognition (possible synonyms are: audio-ctc, speech2seq-lm).
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using framework PyTorch: 2.0.1+cu117
/home/ace/.local/lib/python3.10/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:595: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/home/ace/.local/lib/python3.10/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:634: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Post-processing the exported models...
Validating ONNX model onnxOptimum/model.onnx...
-[✓] ONNX model output names match reference model (logits)
- Validating ONNX Model output "logits":
-[✓] (2, 49, 110) matches (2, 49, 110)
-[x] values not close enough, max diff: 4.9054622650146484e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- logits: max diff = 4.9054622650146484e-05.
The exported model was saved at: onnxOptimum
Regarding the "-[x] values not close enough, max diff: 4.9054622650146484e-05 (atol: 1e-05)" line (it was an x, not a tick): does that mean this generated model also won't work?
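To convince myself either way, my plan is to compare the exported model's logits against the original PyTorch model at a looser tolerance than the validator's 1e-05. Here is a rough sketch of that check; the dummy input, the onnxOptimum/model.onnx path, and the 1e-4 tolerance are my own choices, not anything from the export tooling:

# Rough sanity check: compare ONNX Runtime output to the PyTorch model on a dummy input.
# The input length and the atol=1e-4 threshold are my own assumptions.
import numpy as np
import onnxruntime as ort
import torch
from transformers import AutoModelForCTC

model_id = "nguyenvulebinh/wav2vec2-base-vietnamese-250h"
dummy = np.random.randn(1, 16000).astype(np.float32)  # ~1 second of fake 16 kHz audio

# logits from the exported ONNX model
session = ort.InferenceSession("onnxOptimum/model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
onnx_logits = session.run(None, {input_name: dummy})[0]

# logits from the original PyTorch model
pt_model = AutoModelForCTC.from_pretrained(model_id).eval()
with torch.no_grad():
    pt_logits = pt_model(torch.from_numpy(dummy)).logits.numpy()

print("max abs diff:", np.abs(onnx_logits - pt_logits).max())
print("close at atol=1e-4:", np.allclose(onnx_logits, pt_logits, atol=1e-4))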