Hi,
I tried to convert the PaliGemma 3B model with 224 image resolution to ONNX using Optimum and got this error:
$optimum-cli export onnx --model google/paligemma-3b-pt-224 paligemma-3b-pt-224_onnx/
KeyError: "Unknown task: image-text-to-text. Possible values are: audio-classification
for AutoModelForAudioClassification, audio-frame-classification
for AutoModelForAudioFrameClassification, audio-xvector
for AutoModelForAudioXVector, automatic-speech-recognition
for ('AutoModelForSpeechSeq2Seq', 'AutoModelForCTC'), depth-estimation
for AutoModelForDepthEstimation, feature-extraction
for AutoModel, fill-mask
for AutoModelForMaskedLM, image-classification
for AutoModelForImageClassification, image-segmentation
for ('AutoModelForImageSegmentation', 'AutoModelForSemanticSegmentation', 'AutoModelForInstanceSegmentation', 'AutoModelForUniversalSegmentation'), image-to-image
for AutoModelForImageToImage, image-to-text
for ('AutoModelForVision2Seq', 'AutoModel'), mask-generation
for AutoModel, masked-im
for AutoModelForMaskedImageModeling, multiple-choice
for AutoModelForMultipleChoice, object-detection
for AutoModelForObjectDetection, question-answering
for AutoModelForQuestionAnswering, reinforcement-learning
for AutoModel, semantic-segmentation
for AutoModelForSemanticSegmentation, text-to-audio
for ('AutoModelForTextToSpectrogram', 'AutoModelForTextToWaveform'), text-generation
for AutoModelForCausalLM, text2text-generation
for AutoModelForSeq2SeqLM, text-classification
for AutoModelForSequenceClassification, token-classification
for AutoModelForTokenClassification, visual-question-answering
for AutoModelForVisualQuestionAnswering, zero-shot-image-classification
for AutoModelForZeroShotImageClassification, zero-shot-object-detection
for AutoModelForZeroShotObjectDetection"
Please help if you have any solution. Is the "image-text-to-text" task available in Optimum? If yes, how do I use it?
Or is there any alternative method to convert the model to onnx?
It seems that this can be avoided by explicitly specifying a task (one of the supported ones).
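As a sketch, the explicit-task invocation can be assembled like this (assuming optimum is installed; "image-to-text" is one of the tasks listed in the KeyError above, and the export may still fail for unsupported architectures, as the follow-up replies show):

```python
# Sketch: building an optimum-cli export command with an explicit --task.
import subprocess

def build_export_cmd(model_id: str, task: str, output_dir: str) -> list[str]:
    """Assemble the optimum-cli export command with an explicit task flag."""
    return [
        "optimum-cli", "export", "onnx",
        "--model", model_id,
        "--task", task,
        output_dir,
    ]

cmd = build_export_cmd("google/paligemma-3b-pt-224", "image-to-text",
                       "paligemma-3b-pt-224_onnx/")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run the export
```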
(GitHub issue, opened 12 Jul 2024, 10:24 AM UTC, label: onnx)
### Feature request
I wonder if the task text-classification can be supported in the ONNX export for clip. I want to use the openai/clip-vit-large-patch14 model for zero-shot image classification (classifying images without pretraining, based on given candidate labels), but I get the following error:
ValueError                                Traceback (most recent call last)
File /home/danne00a/ZablageBlazeG/ZeroShotClassification/zeroshotclassifier.py:2
      1 #%%
----> 2 ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)

File ~/mambaforge/envs/ZeroShot_Mamba_env/lib/python3.11/site-packages/optimum/onnxruntime/modeling_ort.py:669, in ORTModel.from_pretrained(cls, model_id, export, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, provider, session_options, provider_options, use_io_binding, **kwargs)
    620 @classmethod
    621 @add_start_docstrings(FROM_PRETRAINED_START_DOCSTRING)
    622 def from_pretrained(
   (...)
    636     **kwargs,
    637 ):
    638     """
    639     provider (`str`, defaults to `"CPUExecutionProvider"`):
    640         ONNX Runtime provider to use for loading the model. See https://onnxruntime.ai/docs/execution-providers/ for
   (...)
    667     `ORTModel`: The loaded ORTModel model.
    668     """
--> 669     return super().from_pretrained(
    670         model_id,
    671         export=export,
    672         force_download=force_download,
    673         use_auth_token=use_auth_token,
    674         cache_dir=cache_dir,
...
File ~/mambaforge/envs/ZeroShot_Mamba_env/lib/python3.11/site-packages/optimum/exporters/onnx/__main__.py
    274 )
    276 # TODO: Fix in Transformers so that SdpaAttention class can be exported to ONNX. `attn_implementation` is introduced in Transformers 4.36.
    277 if model_type in SDPA_ARCHS_ONNX_EXPORT_NOT_SUPPORTED and _transformers_version >= version.parse("4.35.99"):

ValueError: Asked to export a clip model for the task text-classification, but the Optimum ONNX exporter only supports the tasks feature-extraction, zero-shot-image-classification for clip. Please use a supported task. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the task text-classification to be supported in the ONNX export for clip.
### Motivation
I'm struggling with the size of the openai/clip-vit-large-patch14 model, thus I want to convert it to ONNX with Optimum!
### Your contribution
no ideas so far..
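The ValueError above already names the tasks the exporter does support for clip, so one fix is simply to retry with a valid --task. As a small stdlib-only sketch, the supported tasks can even be recovered from the message programmatically (the message format assumed here is taken from the traceback above):

```python
# Sketch: extract the supported task names from Optimum's ValueError text.
import re

MSG = ("Asked to export a clip model for the task text-classification, but the "
       "Optimum ONNX exporter only supports the tasks feature-extraction, "
       "zero-shot-image-classification for clip.")

def supported_tasks(error_msg: str) -> list[str]:
    """Pull the comma-separated task list out of the exporter's error message."""
    m = re.search(r"only supports the tasks (.+?) for", error_msg)
    return [t.strip() for t in m.group(1).split(",")] if m else []

print(supported_tasks(MSG))
# -> ['feature-extraction', 'zero-shot-image-classification']
```

With one of those tasks, something like `optimum-cli export onnx --model openai/clip-vit-large-patch14 --task zero-shot-image-classification clip_onnx/` should at least pass the task check.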
I tried specifying one of the existing tasks, image-to-text, but that throws another error:
$optimum-cli export onnx --model google/paligemma-3b-pt-224 --task image-to-text paligemma-3b-pt-224_onnx/
ValueError: Trying to export a paligemma model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as custom_onnx_configs. Please refer to "Export a model to ONNX with optimum.exporters.onnx" for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type paligemma to be supported natively in the ONNX export.
Of course, some of the newer models are not supported, but I found a converted version of PaliGemma 2. Maybe the GitHub (development) version of Optimum supports it.
The best way to find out is to ask the ONNX Community, who distribute it…