I exported a CLIP model to ONNX using:
optimum-cli export onnx -m laion/CLIP-ViT-L-14-laion2B-s32B-b82K --framework pt clip_onnx
I then tried to quantize it with:
optimum-cli onnxruntime quantize --onnx_model clip_onnx/ --arm64 -o quantized_clip_onnx
However, I am not sure how to run inference with the resulting quantized_clip_onnx model.
Most code samples start from something like:
ORTModelForSequenceClassification.from_pretrained(…)
However, there doesn't seem to be an ORTModel*** class for the CLIP family of models. Has anyone succeeded with this, and could you point me to a solution?
Thanks.