@radames, I want to deploy sentence-transformers/clip-ViT-B-32 as an Inference Endpoint, but I can’t figure out how to make it accept images. I know that locally this model accepts both images and text, but how do I get the same behavior on a deployed HF endpoint?
My goal is to generate embeddings from an image…
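
For reference, here is a rough sketch of the kind of setup I have in mind, in case it helps frame the question. It assumes a custom `handler.py` with an `EndpointHandler` class is the right route, and that the image is sent as a base64 string under `inputs.image`. That payload layout and the helper names `build_payload` and `extract_image_bytes` are my own invention, not something the endpoint defines. I have not verified this on a deployed endpoint.

```python
import base64
import io
import json


def build_payload(image_bytes: bytes) -> str:
    """Client side: wrap raw image bytes as base64 inside a JSON body.

    The {"inputs": {"image": ...}} layout is an assumption for this sketch.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"inputs": {"image": encoded}})


def extract_image_bytes(data: dict) -> bytes:
    """Server side: recover the raw image bytes from the parsed request body."""
    return base64.b64decode(data["inputs"]["image"])


class EndpointHandler:
    """Sketch of a custom handler; assumes the endpoint passes the parsed JSON as a dict."""

    def __init__(self, path: str = ""):
        # Imported lazily so the helpers above can be used without the model installed.
        from sentence_transformers import SentenceTransformer

        self.model = SentenceTransformer("sentence-transformers/clip-ViT-B-32")

    def __call__(self, data: dict) -> dict:
        from PIL import Image

        # Decode the base64 payload back into a PIL image.
        image = Image.open(io.BytesIO(extract_image_bytes(data))).convert("RGB")
        # Locally, encode() accepts a PIL image for this CLIP model.
        embedding = self.model.encode(image)
        return {"embedding": embedding.tolist()}
```

On the client, the idea would be to POST `build_payload(open("photo.jpg", "rb").read())` to the endpoint URL with a JSON content type (`photo.jpg` is just a placeholder file name).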