Please read the topic category description to understand what this is all about
Description
One of the most exciting developments in 2021 was the release of OpenAI’s CLIP model, which was trained on a variety of (text, image) pairs. One of the cool things you can do with this model is use it to combine text and image embeddings to perform neural style transfer. In neural style transfer, the idea is to provide a prompt like “a starry night painting” and an image, and then get the model to produce a painting of the image in that style.
The goal of this project is to learn whether CLIP can produce good paintings from text prompts.
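As a very rough sketch of what "combining text and image embeddings" looks like in code, CLIP can score an image against a few candidate prompts. The checkpoint name and the local image path below are illustrative assumptions, not a fixed recipe:

```python
# Minimal sketch: load CLIP from the Hub and compare one image against two prompts.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # illustrative local image
inputs = processor(
    text=["a starry night painting", "a photograph of a city"],
    images=image,
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)
# Higher logits mean the image embedding is closer to that prompt's text embedding
print(outputs.logits_per_image.softmax(dim=-1))
```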
Model(s)
The CLIP models can be found on the Hub
Datasets
For this project, you probably won’t need an actual dataset to perform neural style transfer. Just a single image should be enough to tune CLIP and an image encoder. Of course, you are free to experiment with larger datasets if you want!
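One way to interpret "a single image should be enough" is to keep CLIP frozen and directly optimise the pixels of the image so that its CLIP embedding moves towards the prompt's embedding. The sketch below assumes the openai/clip-vit-base-patch32 checkpoint, a local content.jpg, and toy hyperparameters; a serious attempt would add a content loss, regularisation, and augmentations:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
model.requires_grad_(False)  # CLIP stays frozen; only the image is updated
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "a starry night painting"
content = Image.open("content.jpg").convert("RGB")  # illustrative input image

# Frozen text embedding for the style prompt
text_inputs = processor(text=[prompt], return_tensors="pt").to(device)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Start from the (preprocessed) content image and optimise its pixels directly
pixels = processor(images=content, return_tensors="pt")["pixel_values"].to(device)
pixels = pixels.clone().requires_grad_(True)
optimizer = torch.optim.Adam([pixels], lr=0.05)

for step in range(200):
    image_emb = model.get_image_features(pixel_values=pixels)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    loss = -(image_emb * text_emb).sum()  # maximise cosine similarity with the prompt
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Be aware that optimising raw pixels against CLIP alone tends to produce noisy, adversarial-looking images, which is why much CLIP-guided work optimises a generator or adds heavy augmentations; choosing that setup is part of the project.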
Challenges
This project goes beyond the concepts introduced in Part II of the Course, so some familiarity with computer vision would be useful. Having said that, the Transformers API is similar for image tasks, so if you know how the pipeline() function works, then you’ll have no trouble adapting to this new domain.
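For instance, CLIP is already exposed through the zero-shot-image-classification pipeline; the checkpoint, image path, and labels below are just illustrative:

```python
from transformers import pipeline

# The familiar pipeline() API, applied to an image task with a CLIP checkpoint
classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)
preds = classifier(
    "photo.jpg",  # path to a local image (a PIL image also works)
    candidate_labels=["a starry night painting", "a photograph", "a sketch"],
)
print(preds)  # list of {"label": ..., "score": ...} dicts
```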
Desired project outcomes
- Create a Streamlit or Gradio app on Spaces that allows a user to provide an image and a text prompt, and produces a painting of that image in the desired style
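A minimal Gradio sketch of such a Space could look like the following, with a placeholder stylise function standing in for the actual CLIP-guided style transfer:

```python
import gradio as gr
from PIL import Image

def stylise(image: Image.Image, prompt: str) -> Image.Image:
    # Placeholder: plug the CLIP-guided style transfer in here
    return image

demo = gr.Interface(
    fn=stylise,
    inputs=[gr.Image(type="pil"), gr.Textbox(label="Style prompt")],
    outputs=gr.Image(type="pil"),
    title="CLIP neural style transfer",
)

if __name__ == "__main__":
    demo.launch()
```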
Additional resources
You can Google “neural style transfer” to find plenty of information about this technique. Here is one advanced example to give you an idea:
Discord channel
To chat and organise with other people interested in this project, head over to our Discord and:
- Follow the instructions on the #join-course channel
- Join the #neural-style-transfer channel
Just make sure you comment here to indicate that you’ll be contributing to this project