Question regarding the open-source status and commercial use of CLIP models

Hello. I’m a beginner here and I’m unsure about the license of CLIP and all of its fine-tuned versions. For example, sentence-transformers’ clip-ViT-B-32-multilingual-v1 has an Apache 2.0 license, but it is based on OpenAI’s CLIP, which has an MIT license on GitHub, yet they also state that "Any deployed use case of the model - whether commercial or not - is currently out of scope." This is really confusing, as I need to use open-source models for a project (it probably won’t be commercialized, but if it is, would this be a problem?).

I’m currently working on a project where I need to work with both text and images and perform similarity searches. CLIP is giving me a headache because I can’t tell whether I’m allowed to use it for this kind of thing. My other option (I don’t know much about this, though) would be to use two separate models, one for text embeddings and one for image embeddings, keep them in two different spaces, run each search in the appropriate space (either text or images), and then somehow combine the results, but I’m not sure how to proceed.
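
A minimal sketch of the two-model idea, assuming each item already has a text embedding and an image embedding produced by two unrelated models. The vectors below are toy values, and `fused_search` and `text_weight` are hypothetical names used only for illustration. Each query is scored in its own space with cosine similarity, and the two scores are then combined with a weighted sum:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def fused_search(query_text_vec, query_image_vec, items, text_weight=0.5):
    """Score every item in the text space and the image space separately,
    then combine the two scores with a weighted sum (hypothetical helper)."""
    results = []
    for item in items:
        text_score = cosine(query_text_vec, item["text_vec"])
        image_score = cosine(query_image_vec, item["image_vec"])
        combined = text_weight * text_score + (1 - text_weight) * image_score
        results.append((item["id"], combined))
    # Highest combined score first
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results

# Toy data: the text and image spaces may have different dimensions,
# because the two spaces are never compared with each other directly.
items = [
    {"id": "a", "text_vec": [1.0, 0.0], "image_vec": [0.0, 1.0, 0.0]},
    {"id": "b", "text_vec": [0.0, 1.0], "image_vec": [1.0, 0.0, 0.0]},
]
ranked = fused_search([1.0, 0.0], [0.0, 1.0, 0.0], items)
print(ranked)
```

The weighted sum is only one possible way to "add them up"; the weight would need tuning, and unlike CLIP this setup cannot match a text query against images directly, since the two spaces are not aligned.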


OpenAI’s repository license allows commercial use: CLIP/LICENSE at main · openai/CLIP · GitHub. The model card only mentions that they hadn’t assessed commercial applications when releasing the model.

So does that mean I can use “sentence-transformers/clip-ViT-B-32-multilingual-v1”, for example, in a commercial app without any legal issues? Thanks!

Yes, since it has an Apache 2.0 license.