SentenceTransformer models can encode images and text into a single vector space. You could combine both modalities to build a shared vector space for products, and then train it further with a contrastive learning objective.
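To make the idea concrete, here is a minimal sketch of a symmetric InfoNCE-style contrastive loss over paired image/text embeddings, written in plain NumPy. The embeddings would in practice come from something like `model.encode(images)` and `model.encode(texts)` with a CLIP-style SentenceTransformer; the function name and temperature value here are my own choices, not part of the library (sentence-transformers also ships ready-made losses such as `MultipleNegativesRankingLoss` that implement essentially this objective for training):

```python
import numpy as np

def info_nce_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of matched (image, text) pairs.

    image_emb, text_emb: arrays of shape (batch, dim), row i of each
    belonging to the same product, so the positives sit on the diagonal
    of the similarity matrix.
    """
    # L2-normalise so dot products become cosine similarities
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix, scaled by the temperature
    logits = image_emb @ text_emb.T / temperature
    labels = np.arange(len(logits))  # matching pair i is at column i

    def cross_entropy(lg, lb):
        # numerically stable log-softmax cross-entropy
        lg = lg - lg.max(axis=1, keepdims=True)
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(lb)), lb].mean()

    # symmetric: image->text retrieval and text->image retrieval
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

# Hypothetical usage: replace the random arrays with real embeddings,
# e.g. img = model.encode(pil_images), txt = model.encode(descriptions)
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 512))
txt = rng.normal(size=(4, 512))
loss = info_nce_loss(img, txt)
```

For actual training you would compute this loss on the model's output tensors inside an autograd framework (e.g. PyTorch) so gradients flow back into the encoder; the NumPy version above is only meant to show the shape of the objective.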
Like in the notebook referenced by @raphaelmerx, I also used a pre-trained CLIP model to embed images and text in the same vector space, so you can perform semantic search: Weights & Biases.
@raphaelmerx in the given example, you have shown `model.encode` for encoding images and text. Do you have an example of how to apply that to contrastive learning?