Use OpenAI's CLIP for image search

lewtun · November 10, 2021, 4:51pm

Please read the topic category description to understand what this is all about

Description

One of the most exciting developments in 2021 was the release of OpenAI’s CLIP model, which was trained on a variety of (text, image) pairs. One of the cool things you can do with this model is use it for text-to-image and image-to-image search (similar to what is possible when you search for images on your phone).

The goal of this project is to experiment with CLIP and learn about multimodal models. Several ideas can be explored, including:

Create a text-to-image search engine that allows users to search for images based on natural language queries. Although CLIP was only trained on English text, you can use techniques like Multilingual Knowledge Distillation to extend the embeddings to new languages.
Create an image-to-image search engine that returns similar images, given a “query” image.
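
Both ideas come down to the same retrieval step: embed the query (text or image) and the candidate images into one vector space, then rank the images by cosine similarity to the query. Here is a minimal sketch of that ranking step only. The file names and the 3-dimensional vectors are made-up stand-ins, and the `normalize` and `search` helpers are hypothetical names for illustration; in a real project the vectors would be computed with a CLIP model from the Hub.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(query_embedding, image_embeddings, top_k=3):
    """Return the top_k (image_id, score) pairs ranked by cosine similarity."""
    query = normalize(query_embedding)
    scored = []
    for image_id, embedding in image_embeddings.items():
        candidate = normalize(embedding)
        score = sum(q * c for q, c in zip(query, candidate))
        scored.append((image_id, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real CLIP image embeddings
# (hypothetical values, not produced by any model).
image_embeddings = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "beach.jpg": [0.0, 0.1, 0.9],
}

# Stand-in for the embedding of a text query or of a query image.
query_embedding = [0.8, 0.2, 0.1]

results = search(query_embedding, image_embeddings, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

Because the text and image embeddings live in the same space, the same `search` function would serve text-to-image and image-to-image search; only the way the query vector is produced changes. For a large collection like Unsplash, a brute-force loop like this would probably be replaced by a vectorised or approximate nearest-neighbour search.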

Model(s)

The CLIP models can be found on the Hub.

Datasets

A common dataset that’s used for image demos is the Unsplash Dataset. You can get access to it here

Challenges

This project goes beyond the concepts introduced in Part II of the Course, so some familiarity with computer vision would be useful. Having said that, the Transformers API is similar for image tasks, so if you know how the pipeline() function works, then you’ll have no trouble adapting to this new domain.

Desired project outcomes

Create a Streamlit or Gradio app on Spaces that allows a user to find images that resemble a natural language query or input image.
Don’t forget to push all your models and datasets to the Hub so others can build on them!

Additional resources

https://www.sbert.net/examples/applications/image-search/README.html

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

Follow the instructions on the #join-course channel
Join the #image-search channel (currently full!)
Join the #image-search-group2 channel

Just make sure you comment here to indicate that you’ll be contributing to this project

DrishtiSharma · November 15, 2021, 3:07pm

Hi @lewtun ! I’d love to work on this project.

JLD · November 15, 2021, 6:14pm

I would also like to participate in this project

amir22010 · November 15, 2021, 7:39pm

I am interested in it, please share channel name

ubamba98 · November 15, 2021, 8:50pm

Hey I would be interested in working on this project

lewtun · November 15, 2021, 10:08pm

Awesome, 4 people already! You can head over to Discord if you want to coordinate / chat etc
I’ve added the project name in this topic’s description

sgugger · November 15, 2021, 10:19pm

Don’t forget to all fill out the AWS form to get access to an account for the free compute on SageMaker!

RobotJelly · November 16, 2021, 5:58am

Hi , I am also interested in this project !!

lewtun · November 16, 2021, 9:34am

Hey @RobotJelly, I think we already have 4 people in this project (the team limit), so you can either join this similar project or work on this one by yourself

RobotJelly · November 16, 2021, 9:57am

Hi @lewtun, ok, so can you please suggest a beginner project that I can work on by myself (without any other team members)?

lewtun · November 16, 2021, 9:59am

Hi @RobotJelly the only constraint is that we’re reserving the Amazon SageMaker compute for teams, so if you have your own GPUs / cloud provider, then you’re more than welcome to work on this project by yourself

RobotJelly · November 16, 2021, 10:02am

Oh ok @lewtun, actually I don’t have any cloud provider service, so hmm… I didn’t realise that a team was really needed, and that’s why I’ve already submitted the form with this project title

RobotJelly · November 16, 2021, 10:04am

So should I fill out the form again with another project? Also, can you suggest a project where a team is available?

lewtun · November 16, 2021, 10:09am

Hey @RobotJelly here’s a few free alternatives for GPUs:

Google Colab: https://colab.research.google.com/
Kaggle notebooks: Notebooks Documentation | Kaggle
Paperspace Gradient: Free GPU

If those aren’t an option for you, I recommend checking through the #course:course-event category and seeing if there’s an idea that interests you and doesn’t have anyone signed up (just check the comments).

Alternatively you are more than welcome to propose a project of your own!

RobotJelly · November 16, 2021, 12:41pm

thank you for sharing . will surely check these alternatives

marcelcastrobr · November 16, 2021, 6:25pm

@RobotJelly Let’s do the image search project together as a second group? Is that ok @lewtun? Or should we select a completely new project? This relates to an idea I have connected to my current job.

sgugger · November 17, 2021, 1:43am

I think it’s okay to have a second group working on this. I have created a team for you and @RobotJelly. If anyone else joins you, the code is use-openais-clip-for-image-search-group2 (when filling in the name of the project in the AWS form).

RobotJelly · November 17, 2021, 7:24am

Great! Ok, I am ready, though I am quite a beginner in NLP-related things, so I’m going through the course content as well. Do let me know when to start?

marcelcastrobr · November 17, 2021, 7:34am

Thanks Sylvain. I just filled out the AWS form, now with use-openais-clip-for-image-search-group2 as the project code.

ysharma · November 17, 2021, 8:43am

Hi @marcelcastrobr @sgugger, this sounds like a very interesting project. Can I also join your second team on the use-openais-clip-for-image-search-group2 project?
