How do I give CLIP only an image as input (without text)?

Hello everyone! I’ve converted my CLIP model to ONNX and wanted to run inference on it, but it doesn’t work without a text input. Is it possible to fix that?