I have a dataset with 1 text and 10 images. I can pass one image with multiple text inputs to CLIPModel, but can I pass multiple images with a single text, like below?
outputs = model(text,[image1, image2, image3,......])
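Yes, it is possible. logits_per_image has shape (num_images, num_texts), so with a single text it has only one column, and a softmax over dim=1 would return 1.0 for every image. To get a probability for each image, apply the softmax over dim=0 instead, i.e. probs = logits_per_image.softmax(dim=0).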
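To see why the dim matters, here is a minimal sketch in plain Python that mimics the two softmax directions on a (3, 1) matrix. The logit values are made up for illustration and do not come from a real model, and the `softmax` helper is a hypothetical stand-in for the tensor method, not part of any library.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits standing in for logits_per_image with 3 images and 1 text,
# i.e. shape (num_images, num_texts) = (3, 1).
logits_per_image = [[24.0], [19.5], [21.0]]

# Counterpart of softmax(dim=1): normalize across texts within each image row.
# With a single text every row has one entry, so each probability is 1.0.
probs_over_texts = [softmax(row) for row in logits_per_image]

# Counterpart of softmax(dim=0): normalize across images for the single text.
column = [row[0] for row in logits_per_image]
probs_over_images = softmax(column)

print(probs_over_texts)   # one entry per image, not informative
print(probs_over_images)  # a distribution over the images
```

With a single text, only the second result ranks the images against each other, which is what softmax(dim=0) gives on the real tensor.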