MedClip - Pretraining CLIP on medical data


Advancements in computer vision and deep learning techniques carry the potential to make significant contributions to healthcare. The current state-of-the-art models for automated diagnosis and outcome prediction using medical imaging tend not to consider additional information such as medical reports.

A multimodal model like CLIP, pre-trained on medical data, could enable new medical applications that combine text and images.
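For context, CLIP learns by contrasting matched image–text pairs against the mismatched pairs in a batch. A minimal NumPy sketch of the symmetric contrastive (InfoNCE) objective — function and variable names here are illustrative, not from any released codebase:

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings."""
    # L2-normalize so dot products are cosine similarities
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (batch, batch) similarity matrix
    labels = np.arange(logits.shape[0])       # i-th image matches i-th text

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image->text and text->image directions
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

A correctly paired batch should score a lower loss than the same embeddings with the text rows shuffled; that gap is what training maximizes.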


Pre-trained ViT and SciBERT models can be found on the model hub.


The MIMIC-CXR dataset can be used for this task. For privacy reasons, access to the dataset is restricted: anyone who wants to participate in this project must obtain the necessary credentials.
In my experience, getting access to MIMIC-CXR is not particularly complicated: you need to accept the terms of the license and take a short course on medical data management. It normally takes ~2 weeks to get the credentials.

Available training scripts

A training script for this will be provided soon. (see PR )


Carrying out a proper evaluation of the model may be difficult.


I would like to join this project! :slightly_smiling_face:

@shpotes That’s an interesting project to work on. I’ve worked with Transformers on the MIMIC-CXR database before, and I would like to experiment with how CLIP fares.

Regarding the database, I believe that one of the team members in a group can get access to the database and work with it.

Let’s connect and form a group, if possible, to carry this project forward.

Hi! I’m very interested in applying NLP to medicine and would love to join this project!

Hi, I am Sweta from India. I am working on deep learning for medical image analysis for my MSc thesis, and am generally interested in applications of AI in medicine/healthcare. With this project, I will be able to work on a new dimension, i.e., NLP in healthcare. Hence, I am very interested in joining this project and working with everyone to hone my NLP skills.
My time zone is IST(GMT + 5:30).

That’s a great idea - let’s officially define this project then :slight_smile:

Putting everybody in the official sheet here. More people can still join! Leave a comment here or in the sheet if you want to change something.

I would like to join this project as well! :slight_smile:

Awesome! Added you to the team :slight_smile:


Hi, may I join as well?

Sure! Just added you to the team :slight_smile:

According to the dataset license, it is not possible to share the dataset with anyone else (I assume that also includes the other participants in the project).


Hey, have been also doing a lot of deep learning in healthcare space, would love to join this!



I am very interested in this topic as well. I work as a Sr. Clinical Data Scientist at Stanford.

Keep me posted on how I can contribute.


It would be important to see if the data can be used and, if so, how! Also, maybe it makes sense to fine-tune the official CLIP weights on the medical data instead of pretraining from scratch? @valhalla


I also think it might make more sense to fine-tune the official CLIP weights! Applying for the data might take a few days, though, so if this project is selected, we should take that into account!

I think this initiative can qualify as “lawful use in scientific research”, so I don’t think there is any problem. In any case, I can communicate directly with the license owners and ask them about it.

Also maybe it might make sense to fine-tune the official CLIP weights on the medical data instead of pretraining from scratch?

Considering the amount of data available, fine-tuning will probably work better. The main reason I proposed training from scratch instead of fine-tuning is the vocabulary: medical text contains many terms that are unusual in more standard domains, and standard tokenizers tend to handle them poorly (see, for instance, Beltagy et al., 2019).
I suppose techniques such as vocabulary recycling (de Vries & Nissim, 2021) or Adapters (Houlsby et al., 2019; Pfeiffer et al., 2020) could solve this problem.
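To illustrate the vocabulary problem, here is a toy greedy longest-match-first tokenizer (the matching scheme WordPiece uses) with two hypothetical vocabularies: the general-domain one shreds a clinical term into many subwords, while a vocabulary that includes the term keeps it as one token.

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first subword tokenization (WordPiece-style).
    Continuation pieces carry the '##' prefix, as in BERT tokenizers."""
    tokens, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:          # no piece matches -> unknown token
            return ["[UNK]"]
        tokens.append(piece)
        start = end
    return tokens

# Hypothetical vocabularies, for illustration only
general_vocab = {"card", "##io", "##me", "##ga", "##ly", "lung"}
medical_vocab = general_vocab | {"cardiomegaly"}

print(wordpiece_tokenize("cardiomegaly", general_vocab))
# -> ['card', '##io', '##me', '##ga', '##ly']  (five fragments)
print(wordpiece_tokenize("cardiomegaly", medical_vocab))
# -> ['cardiomegaly']  (a single token)
```

Real general-domain tokenizers behave similarly on clinical terms, which is the fragmentation that Beltagy et al. (2019) motivated SciBERT's in-domain vocabulary with.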


Hi, this is an interesting idea. I would like to join this team if it is possible. I’m working as a Research Engineer (computer vision) at a medical technology startup in Tokyo.


Hi! I would be very interested in joining this project! I am an ML Engineer at Ferrum Health, a healthcare startup in San Francisco, working on both NLP and computer vision. I am familiar with DICOM, and I am currently working with another of the MIMIC datasets (MIMIC-III).


As Patrick said, please see if the data can be made available before the sprint!

And regarding fine-tuning and the medical vocabulary: in this case, we could maybe pair a text encoder that is already trained on medical data with CLIP’s vision model, instead of starting from scratch.
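Structurally, a hybrid setup like that only needs projection heads mapping the two backbones' (possibly different-sized) feature spaces into one shared embedding space, similar in spirit to the dual-encoder models in Transformers. A minimal NumPy sketch, with all names and dimensions hypothetical and random features standing in for the backbones:

```python
import numpy as np

rng = np.random.default_rng(42)

class HybridDualEncoder:
    """Pairs a vision backbone and a text backbone via learned linear
    projections into a shared embedding space. The projections are the
    only new parameters; the backbones could be e.g. a CLIP ViT and
    SciBERT."""

    def __init__(self, vision_dim=768, text_dim=768, shared_dim=512):
        # random init standing in for trained projection weights
        self.vision_proj = rng.normal(size=(vision_dim, shared_dim)) / np.sqrt(vision_dim)
        self.text_proj = rng.normal(size=(text_dim, shared_dim)) / np.sqrt(text_dim)

    def project(self, vision_feats, text_feats):
        v = vision_feats @ self.vision_proj
        t = text_feats @ self.text_proj
        # L2-normalize so dot products are cosine similarities
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        t /= np.linalg.norm(t, axis=1, keepdims=True)
        return v, t

model = HybridDualEncoder()
v, t = model.project(rng.normal(size=(4, 768)), rng.normal(size=(4, 768)))
sims = v @ t.T  # (4, 4) image-report cosine similarity matrix
```

Only the two projection matrices (and optionally the backbones) would be trained with the contrastive objective, which is much cheaper than pretraining everything from scratch.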

Added you to the team @kaushalya and @edugp :slight_smile:
