Hi all, I’m just getting started with semantic search and zero-shot image classification. This is my first foray into data science, so please bear with me.
I am hoping that someone can help me get better results from my semantic search.
This is what I have done so far:
- Downloaded a sample set of 4,000 images from Unsplash
- Created an OpenSearch index to store the vectors as follows (the index-creation call is sketched right after the mapping):
"settings": {
"index": {
"knn": true,
"knn.algo_param.ef_search": 512
}
},
"mappings": {
"properties": {
"image_vector": {
"type": "knn_vector",
"dimension": 512,
"method": {
"name": "hnsw",
"space_type": "cosinesimil",
"engine": "nmslib",
"parameters": {
"ef_construction": 512,
"m": 16
}
}
}
}
}
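For reference, I create the index with the opensearch-py client, roughly like this (the host and the "images" index name are placeholders for my actual setup):

from opensearchpy import OpenSearch

# Placeholder connection details; adjust host/auth for your cluster
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

index_body = {
    "settings": {"index": {"knn": True, "knn.algo_param.ef_search": 512}},
    "mappings": {
        "properties": {
            "image_vector": {
                "type": "knn_vector",
                "dimension": 512,
                "method": {
                    "name": "hnsw",
                    "space_type": "cosinesimil",
                    "engine": "nmslib",
                    "parameters": {"ef_construction": 512, "m": 16},
                },
            }
        }
    },
}
client.indices.create(index="images", body=index_body)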
- Processed each image with CLIP ViT-B/32 and stored the embeddings in OpenSearch (the indexing step is sketched after the function below):
import clip
import torch
from PIL import Image

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def create_image_embedding(image_path):
    # Preprocess to CLIP's expected input and add a batch dimension
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image)
    return image_features.tolist()[0]  # (1, 512) tensor -> 512-float list
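The storing step mentioned above is roughly this bulk-indexing loop (the directory path and index name are placeholders; error handling omitted):

import glob
from opensearchpy.helpers import bulk

actions = (
    {
        "_index": "images",  # same placeholder index as above
        "_source": {
            "image_path": path,
            "image_vector": create_image_embedding(path),
        },
    }
    for path in glob.glob("unsplash/*.jpg")  # placeholder directory
)
bulk(client, actions)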
- Created a text embedding of the search term using the same model:
def create_text_embedding(text):
    tokens = clip.tokenize([text]).to(device)
    with torch.no_grad():
        text_features = model.encode_text(tokens)
    return text_features.tolist()[0]
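As a quick sanity check, both functions return vectors in the same 512-dimensional space the index mapping expects:

print(len(create_text_embedding("two dogs playing in the snow")))  # 512 for ViT-B/32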
- Queried the OpenSearch index and retrieved the results:
query = {
    "size": 100,
    "_source": {"excludes": ["image_vector"]},
    "query": {
        "knn": {
            "image_vector": {
                "vector": text_embedding,
                "k": 100,
            }
        }
    },
}
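I then run the search with the same client as before (the search term and index name are placeholders):

# `text_embedding` above came from create_text_embedding("two dogs playing in the snow")
response = client.search(index="images", body=query)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("image_path"))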
So, as mentioned, the results are reasonably good, but some inaccurate or downright strange matches rank relatively high. Granted, the dataset is only 4,000 images, so that may be a limiting factor.
Are there any other knobs I can tweak to improve the search accuracy?
Thanks for the help!