Is there a vision model for zero-shot clustering?

rileybol · May 26, 2024, 1:25pm

I have a dataset of images from The Simpsons. I have trained a face-detection model for Simpsons characters with good results, and after a lot of experimenting I have written a script that produces relatively accurate binary images of the faces. I will attach a screenshot as an example. I have also tried using cv2.findContours to extract the contours and treat them as matrices to compute the difference between two faces, with no luck.

My end goal is to be able to cluster these faces by character. I have tried some more basic ML algorithms without success, and now I think this task may be too complex for that.

Is there a vision model that would be well suited to this? Suggestions for other approaches would be great too.

Here is an example of my processed Simpsons faces that I want to cluster:

As a side note, I don’t care that much if, for example, images of Bart Simpson from a front angle end up in a different cluster from images of him from a side angle, as it will be easy enough to manually merge these clusters after the fact.
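
To make the goal concrete, here is a minimal sketch of the kind of pipeline described above: measure a pairwise difference between binary face masks, then group the masks that are close to each other. It is an illustration only, not my actual script. It assumes every mask has already been resized to the same dimensions and is stored as a list of rows of 0/1 values, and it uses Jaccard distance (1 minus intersection over union) in place of the contour comparison. The names `jaccard_distance` and `cluster_masks` and the `threshold` value are made up for this example.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - IoU for two equally sized binary masks (lists of rows of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    # Two empty masks are treated as identical.
    return 0.0 if union == 0 else 1.0 - inter / union


def cluster_masks(masks, threshold=0.5):
    """Single-linkage grouping: masks closer than `threshold` share a cluster.

    `threshold` is an arbitrary illustrative value and would need tuning.
    """
    parent = list(range(len(masks)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(masks)), 2):
        if jaccard_distance(masks[i], masks[j]) < threshold:
            parent[find(i)] = find(j)

    # Relabel the union-find roots as consecutive cluster ids: 0, 1, 2, ...
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(masks))]


if __name__ == "__main__":
    # Tiny 2x2 toy masks standing in for real face masks.
    face_a = [[1, 1], [0, 0]]
    face_b = [[1, 1], [1, 0]]
    face_c = [[0, 0], [0, 1]]
    print(cluster_masks([face_a, face_b, face_c]))
```

The number of clusters is not fixed in advance, which matches the "merge clusters manually afterwards" workflow. The distance function is the piece I would expect a vision model to replace, for example by comparing learned feature vectors instead of raw pixels.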
