Use OpenAI's CLIP for image search

:wave: Please read the topic category description to understand what this is all about

Description

One of the most exciting developments in 2021 was the release of OpenAI’s CLIP model, which was trained on a variety of (text, image) pairs. One of the cool things you can do with this model is use it for text-to-image and image-to-image search (similar to what is possible when you search for images on your phone).

The goal of this project is to experiment with CLIP and learn about multimodal models. Several ideas can be explored, including:

  • Create a text-to-image search engine that allows users to search for images based on natural language queries. Although CLIP was only trained on English text, you can use techniques like Multilingual Knowledge Distillation to extend the embeddings to new languages.
  • Create an image-to-image search engine that returns similar images, given a “query” image.
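
Both ideas come down to the same recipe: embed the query and the images with CLIP, then rank the images by cosine similarity to the query. Here is a minimal sketch of that recipe, assuming the `openai/clip-vit-base-patch32` checkpoint and the `CLIPModel` / `CLIPProcessor` classes from :hugs: Transformers. The helper names (`load_clip`, `embed_texts`, `embed_images`, `top_k`) are made up for illustration, and the ranking is written in plain Python to keep the example small.

```python
import math


def load_clip(checkpoint="openai/clip-vit-base-patch32"):
    # Imported lazily so the ranking helpers below work without Transformers installed.
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(checkpoint)
    processor = CLIPProcessor.from_pretrained(checkpoint)
    return model, processor


def embed_texts(model, processor, texts):
    # Assumption: get_text_features returns a (batch, dim) tensor.
    import torch

    inputs = processor(text=texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        features = model.get_text_features(**inputs)
    return features.tolist()


def embed_images(model, processor, images):
    # `images` is a list of PIL images.
    # Assumption: get_image_features returns a (batch, dim) tensor.
    import torch

    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return features.tolist()


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, image_vectors, k=5):
    # Returns (image_index, score) pairs, best match first.
    scores = [(i, cosine_similarity(query_vector, v)) for i, v in enumerate(image_vectors)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]
```

For text-to-image search, the query vector would come from `embed_texts`; for image-to-image search, from `embed_images`. With a large collection you would probably want to precompute the image vectors once and swap the Python loop for a vectorised or approximate nearest-neighbour search.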

Model(s)

The CLIP models can be found on the Hub

Datasets

A common dataset that’s used for image demos is the Unsplash Dataset. You can get access to it here

Challenges

This project goes beyond the concepts introduced in Part II of the :hugs: Course, so some familiarity with computer vision would be useful. Having said that, the :hugs: Transformers API is similar for image tasks, so if you know how the pipeline() function works, then you’ll have no trouble adapting to this new domain.
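
To give a feel for how close the image API is to the text one, here is a small sketch using pipeline() with a CLIP checkpoint for zero-shot image classification. The checkpoint name, the `photo.jpg` path, and the helper names `build_classifier` and `best_label` are assumptions for illustration only.

```python
def build_classifier(checkpoint="openai/clip-vit-base-patch32"):
    # Imported lazily so best_label() can be used without Transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-image-classification", model=checkpoint)


def best_label(predictions):
    # The pipeline returns a list of {"score": ..., "label": ...} dicts.
    return max(predictions, key=lambda p: p["score"])["label"]


# Example usage (downloads the model; "photo.jpg" is a placeholder path):
# classifier = build_classifier()
# predictions = classifier("photo.jpg", candidate_labels=["a dog", "a cat"])
# print(best_label(predictions))
```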

Desired project outcomes

  • Create a Streamlit or Gradio app on :hugs: Spaces that allows a user to find images that resemble a natural language query or input image.
  • Don’t forget to push all your models and datasets to the Hub so others can build on them!
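
As a starting point for the app, here is a rough Gradio sketch for the text query case. It assumes you already have a function that turns a query into a vector (called `embed_query` here, a hypothetical name), a list of precomputed image vectors, and the matching image paths. It also assumes all vectors are L2-normalised, so a dot product equals cosine similarity.

```python
def make_search_fn(embed_query, image_vectors, image_paths, k=6):
    # embed_query: hypothetical callable mapping a text query to a vector.
    # Assumption: query and image vectors are L2-normalised.
    def search(query):
        q = embed_query(query)
        ranked = sorted(
            range(len(image_vectors)),
            key=lambda i: sum(a * b for a, b in zip(q, image_vectors[i])),
            reverse=True,
        )
        return [image_paths[i] for i in ranked[:k]]

    return search


def build_demo(search):
    # Imported lazily so the search logic can be tested without Gradio.
    import gradio as gr

    return gr.Interface(
        fn=search,
        inputs=gr.Textbox(label="Describe the image you are looking for"),
        outputs=gr.Gallery(label="Closest matches"),
    )


# Example usage once embeddings are available:
# demo = build_demo(make_search_fn(embed_query, image_vectors, image_paths))
# demo.launch()
```

Keeping the search logic separate from the UI makes it easy to swap Gradio for Streamlit, or to replace the brute-force ranking with an index later on.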

Additional resources

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel
  • Join the #image-search channel (currently full!)
  • Join the #image-search-group2 channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:

Hi @lewtun ! :slight_smile: I’d love to work on this project.

I would also like to participate in this project :grinning:

I am interested in it, please share the channel name :heart_eyes:

Hey I would be interested in working on this project :grinning:

Awesome, 4 people already! You can head over to Discord if you want to coordinate / chat etc :slight_smile:
I’ve added the project name in this topic’s description

Everyone, don’t forget to fill out the AWS form to get access to an account for the free compute on SageMaker!

Hi, I am also interested in this project!! :blush:

Hey @RobotJelly, I think we already have 4 people in this project (the team limit), so you can either join this similar project or work on this one by yourself

Hi @lewtun, OK, so can you please suggest a beginner project that I can work on by myself (without any other team members)?

Hi @RobotJelly, the only constraint is that we’re reserving the Amazon SageMaker compute for teams, so if you have your own GPUs / cloud provider, then you’re more than welcome to work on this project by yourself :slight_smile:

Oh OK @lewtun, actually I don’t have any cloud provider service, so hmm… I had no idea that a team was really needed, and that’s why I’ve already submitted the form with this project title.

So should I fill out the form again with another project? Also, can you suggest a project where a team is available?

Hey @RobotJelly here’s a few free alternatives for GPUs:

If those aren’t an option for you, I recommend checking through the #course:course-event category and seeing if there’s an idea that interests you and doesn’t have anyone signed up (just check the comments).

Alternatively you are more than welcome to propose a project of your own!

Thank you for sharing. Will surely check these alternatives :slight_smile: :hugs:

@RobotJelly Let’s do the image search project together as a second group? Is that OK, @lewtun? Or should we select a completely new project? This relates to an idea I have connected to my current job.

I think it’s okay to have a second group working on this :slight_smile: I have created a team for you and @RobotJelly. If anyone else joins you, the code is use-openais-clip-for-image-search-group2 (when filling in the name of the project in the AWS form).

Great! OK, I am ready :muscle: though I am quite a beginner in NLP-related things, so I’m going through the course content as well :slight_smile: Do let me know when to start?

Thanks Sylvain. I just filled out the AWS form, now with use-openais-clip-for-image-search-group2 as the project code.

Hi @marcelcastrobr @sgugger, this sounds like a very interesting project. Can I also join your second team on the use-openais-clip-for-image-search-group2 project?