Hi,
I tried to convert the PaliGemma 3B model with 224 image resolution to ONNX using Optimum and got this error:
$optimum-cli export onnx --model google/paligemma-3b-pt-224 paligemma-3b-pt-224_onnx/
KeyError: "Unknown task: image-text-to-text. Possible values are: audio-classification
for AutoModelForAudioClassification, audio-frame-classification
for AutoModelForAudioFrameClassification, audio-xvector
for AutoModelForAudioXVector, automatic-speech-recognition
for ('AutoModelForSpeechSeq2Seq', 'AutoModelForCTC'), depth-estimation
for AutoModelForDepthEstimation, feature-extraction
for AutoModel, fill-mask
for AutoModelForMaskedLM, image-classification
for AutoModelForImageClassification, image-segmentation
for ('AutoModelForImageSegmentation', 'AutoModelForSemanticSegmentation', 'AutoModelForInstanceSegmentation', 'AutoModelForUniversalSegmentation'), image-to-image
for AutoModelForImageToImage, image-to-text
for ('AutoModelForVision2Seq', 'AutoModel'), mask-generation
for AutoModel, masked-im
for AutoModelForMaskedImageModeling, multiple-choice
for AutoModelForMultipleChoice, object-detection
for AutoModelForObjectDetection, question-answering
for AutoModelForQuestionAnswering, reinforcement-learning
for AutoModel, semantic-segmentation
for AutoModelForSemanticSegmentation, text-to-audio
for ('AutoModelForTextToSpectrogram', 'AutoModelForTextToWaveform'), text-generation
for AutoModelForCausalLM, text2text-generation
for AutoModelForSeq2SeqLM, text-classification
for AutoModelForSequenceClassification, token-classification
for AutoModelForTokenClassification, visual-question-answering
for AutoModelForVisualQuestionAnswering, zero-shot-image-classification
for AutoModelForZeroShotImageClassification, zero-shot-object-detection
for AutoModelForZeroShotObjectDetection"
Please help if you have any solution. Is the "image-text-to-text" task available in Optimum? If yes, how do I use it?
Or is there any alternative method to convert the model to onnx?
It seems that this can be avoided by explicitly specifying a task (one of the supported ones).
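As a sketch, the explicit-task invocation can be assembled like this (assuming optimum is installed; "image-to-text" is one of the tasks listed in the KeyError above, and the export may still fail for unsupported architectures, as the follow-up replies show):

```python
# Sketch: building an optimum-cli export command with an explicit --task.
import subprocess

def build_export_cmd(model_id: str, task: str, output_dir: str) -> list[str]:
    """Assemble the optimum-cli export command with an explicit task flag."""
    return [
        "optimum-cli", "export", "onnx",
        "--model", model_id,
        "--task", task,
        output_dir,
    ]

cmd = build_export_cmd("google/paligemma-3b-pt-224", "image-to-text",
                       "paligemma-3b-pt-224_onnx/")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run the export
```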
(GitHub issue, opened 12 Jul 2024, 10:24 AM UTC, label: onnx)
### Feature request
I wonder if the task text-classification can be supported in the ONNX export for clip. I want to use the openai/clip-vit-large-patch14 model for zero-shot image classification (classifying images without pretraining, based on given candidate labels), but I get the following error:
ValueError                                Traceback (most recent call last)
File /home/danne00a/ZablageBlazeG/ZeroShotClassification/zeroshotclassifier.py:2
      1 #%%
----> 2 ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)

File ~/mambaforge/envs/ZeroShot_Mamba_env/lib/python3.11/site-packages/optimum/onnxruntime/modeling_ort.py:669, in ORTModel.from_pretrained(cls, model_id, export, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, provider, session_options, provider_options, use_io_binding, **kwargs)
    620 @classmethod
    621 @add_start_docstrings(FROM_PRETRAINED_START_DOCSTRING)
    622 def from_pretrained(
   (...)
    636     **kwargs,
    637 ):
    638     """
    639     provider (`str`, defaults to `"CPUExecutionProvider"`):
    640         ONNX Runtime provider to use for loading the model. See https://onnxruntime.ai/docs/execution-providers/ for
   (...)
    667     `ORTModel`: The loaded ORTModel model.
    668     """
--> 669     return super().from_pretrained(
    670         model_id,
    671         export=export,
    672         force_download=force_download,
    673         use_auth_token=use_auth_token,
    674         cache_dir=cache_dir,
...
File ~/mambaforge/envs/ZeroShot_Mamba_env/lib/python3.11/site-packages/optimum/exporters/onnx/__main__.py
    274 )
    276 # TODO: Fix in Transformers so that SdpaAttention class can be exported to ONNX. `attn_implementation` is introduced in Transformers 4.36.
    277 if model_type in SDPA_ARCHS_ONNX_EXPORT_NOT_SUPPORTED and _transformers_version >= version.parse("4.35.99"):

ValueError: Asked to export a clip model for the task text-classification, but the Optimum ONNX exporter only supports the tasks feature-extraction, zero-shot-image-classification for clip. Please use a supported task. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the task text-classification to be supported in the ONNX export for clip.
### Motivation
I'm struggling with the size of the openai/clip-vit-large-patch14 model, thus I want to convert it to ONNX with Optimum!
### Your contribution
no ideas so far..
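The ValueError above already names the tasks the exporter does support for clip, so one fix is simply to retry with a valid --task. As a small stdlib-only sketch, the supported tasks can even be recovered from the message programmatically (the message format assumed here is taken from the traceback above):

```python
# Sketch: extract the supported task names from Optimum's ValueError text.
import re

MSG = ("Asked to export a clip model for the task text-classification, but the "
       "Optimum ONNX exporter only supports the tasks feature-extraction, "
       "zero-shot-image-classification for clip.")

def supported_tasks(error_msg: str) -> list[str]:
    """Pull the comma-separated task list out of the exporter's error message."""
    m = re.search(r"only supports the tasks (.+?) for", error_msg)
    return [t.strip() for t in m.group(1).split(",")] if m else []

print(supported_tasks(MSG))
# -> ['feature-extraction', 'zero-shot-image-classification']
```

With one of those tasks, something like `optimum-cli export onnx --model openai/clip-vit-large-patch14 --task zero-shot-image-classification clip_onnx/` should at least pass the task check.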
I tried specifying one of the existing tasks, image-to-text, but that throws another error:
$optimum-cli export onnx --model google/paligemma-3b-pt-224 --task image-to-text paligemma-3b-pt-224_onnx/
ValueError: Trying to export a paligemma model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as custom_onnx_configs. Please refer to "Export a model to ONNX with optimum.exporters.onnx" for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type paligemma to be supported natively in the ONNX export.
Of course, some of the newer models are not supported, but I found a converted version of PaliGemma 2. Maybe the GitHub (development) version of Optimum supports it.
The best way to find out is to ask the ONNX Community, who distribute it…