How to quantize and run inference for CLIP using Optimum

I exported a CLIP model to ONNX using:

optimum-cli export onnx -m laion/CLIP-ViT-L-14-laion2B-s32B-b82K --framework pt clip_onnx
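
To check what the export produced and what the graph expects at inference time, one can list the model's inputs with onnxruntime. A minimal sketch, assuming the exporter wrote its default `model.onnx` file into `clip_onnx/`:

```python
import onnxruntime as ort

# Load the exported graph; "model.onnx" is the exporter's default file name.
sess = ort.InferenceSession("clip_onnx/model.onnx")

# Print the input names, shapes, and dtypes the model expects
# (for CLIP this should include input_ids, attention_mask, pixel_values).
for inp in sess.get_inputs():
    print(inp.name, inp.shape, inp.type)
```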

I then tried to quantize it with:

optimum-cli onnxruntime quantize --onnx_model clip_onnx/ --arm64 -o quantized_clip_onnx
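
For what it's worth, the same dynamic ARM64 quantization can also be done programmatically with `ORTQuantizer`. A sketch, assuming the export above produced `clip_onnx/model.onnx`:

```python
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Point the quantizer at the exported ONNX file.
quantizer = ORTQuantizer.from_pretrained("clip_onnx", file_name="model.onnx")

# Dynamic quantization config targeting ARM64 kernels.
qconfig = AutoQuantizationConfig.arm64(is_static=False, per_channel=False)

# Writes model_quantized.onnx into the output directory.
quantizer.quantize(save_dir="quantized_clip_onnx", quantization_config=qconfig)
```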

However, I am not sure how to run inference with the resulting quantized_clip_onnx model.

Most code samples start from something like:

ORTModelForSequenceClassification.from_pretrained(…)

However, there doesn't seem to be an ORTModel* class for the CLIP family of models. I am wondering if anyone has succeeded at this and can point me to a solution.
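
The closest I have gotten is to bypass the ORTModel wrappers and run the quantized graph directly with onnxruntime, using `CLIPProcessor` from transformers for pre-processing. A minimal sketch, assuming the quantizer wrote `model_quantized.onnx` (its default output name) and that the first output of the default CLIP export is `logits_per_image`:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image
from transformers import CLIPProcessor

# Pre-processing comes from the original checkpoint, not the ONNX export.
processor = CLIPProcessor.from_pretrained("laion/CLIP-ViT-L-14-laion2B-s32B-b82K")

# "model_quantized.onnx" is assumed to be the quantizer's default output name.
sess = ort.InferenceSession("quantized_clip_onnx/model_quantized.onnx")

image = Image.open("cat.jpg")  # any test image
inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="np",
    padding=True,
)

# Feed all processor outputs (input_ids, attention_mask, pixel_values)
# to the session by name.
outputs = sess.run(None, dict(inputs))

# Assuming outputs[0] is logits_per_image; apply a stable softmax over text labels.
logits = outputs[0]
logits = logits - logits.max(-1, keepdims=True)
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
print(probs)
```

I also noticed `ORTModelForCustomTasks` in optimum.onnxruntime, which appears to wrap arbitrary ONNX models, but I have not verified it with CLIP. Is the direct onnxruntime route above reasonable, or is there a supported ORTModel path I am missing?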

Thanks.

cc @merve who worked on Optimum