Trouble converting CLIP to ONNX

Hello everyone! I would like to convert the CLIP model to ONNX format. I read the documentation on how to do it, and this is what I ended up with:

My libs:
torch - 1.12.1
transformers - 4.23.1
onnxruntime - 1.11.1
onnx-simplifier - 0.4.8

Code:

import time
from PIL import Image
import torch
import onnx
import onnxruntime as ort
from onnxsim import simplify
import transformers
import transformers.onnx
from transformers import CLIPModel, CLIPProcessor
import requests

import warnings
warnings.filterwarnings('ignore')


# Load the processor from the Hub; the model weights are stored locally.
pt_model = CLIPModel.from_pretrained('./models/clip_model/category_clip_model', local_files_only=True)
processor = CLIPProcessor.from_pretrained('openai/clip-vit-base-patch32')

# Save to disk
processor.save_pretrained("local-pt-checkpoint")
pt_model.save_pretrained("local-pt-checkpoint")

# Then convert to ONNX with transformers.onnx (run from a notebook cell):
!python -m transformers.onnx --model=local-pt-checkpoint onnx/

session = ort.InferenceSession("onnx/model.onnx")
image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
inputs = processor(text=["a photo of a cat"], images=image, return_tensors="np", padding=True)
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
# At this point I get an error (screenshot 2)

Also, after converting (with transformers.onnx) I got that kind of output. It seems strange because the exported model has 3 inputs instead of two (image and text).
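For reference, a quick way to check what the export actually produced is to ask the onnxruntime session for its input and output names (a minimal sketch, assuming onnx/model.onnx is the file written by the command above):

import onnxruntime as ort

session = ort.InferenceSession("onnx/model.onnx")

# For a CLIP export via transformers.onnx the inputs are usually
# input_ids, pixel_values and attention_mask, i.e. text, image and text mask,
# which would explain seeing three inputs instead of two.
for i in session.get_inputs():
    print("input:", i.name, i.shape, i.type)

# The outputs are usually logits_per_image, logits_per_text, text_embeds and
# image_embeds; "last_hidden_state" is not one of them, which is likely why
# the session.run call above fails.
for o in session.get_outputs():
    print("output:", o.name, o.shape, o.type)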

Could you include the command you used to convert to ONNX?

You can try:

inputs = processor(text=["a photo of two cats"], images=image, return_tensors="np", padding=True)
session.run(input_feed=dict(inputs), output_names=["logits_per_image"])

which returns logits_per_image for each of the text inputs
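If it helps, turning those logits into per-text probabilities is just a softmax over the text axis. A minimal sketch, assuming the session and inputs from the snippets above:

import numpy as np

# logits_per_image has shape (num_images, num_texts); a softmax over the last
# axis gives the probability of each text prompt for each image.
logits_per_image = session.run(["logits_per_image"], dict(inputs))[0]
exp = np.exp(logits_per_image - logits_per_image.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)
print(probs)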


Are you looking for this:

!python -m transformers.onnx --model=local-pt-checkpoint onnx/
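For completeness, the same export can also be done from Python rather than the CLI. A rough sketch, assuming this transformers version accepts the CLIPProcessor as the preprocessor argument of transformers.onnx.export:

from pathlib import Path
from transformers import CLIPModel, CLIPProcessor
from transformers.onnx import export, FeaturesManager

pt_model = CLIPModel.from_pretrained("local-pt-checkpoint")
processor = CLIPProcessor.from_pretrained("local-pt-checkpoint")

# Look up the ONNX config registered for this architecture/feature.
model_kind, config_ctor = FeaturesManager.check_supported_model_or_raise(pt_model, feature="default")
onnx_config = config_ctor(pt_model.config)

# Export with the config's default opset; writes onnx/model.onnx.
Path("onnx").mkdir(exist_ok=True)
onnx_inputs, onnx_outputs = export(
    processor,
    pt_model,
    onnx_config,
    onnx_config.default_onnx_opset,
    Path("onnx/model.onnx"),
)
print("inputs:", onnx_inputs)
print("outputs:", onnx_outputs)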