How to calculate the price of hosting transformers for semantic search

I have been asked by my manager to discuss the cost of building the following:

  1. A semantic search engine for car images and car parts
  2. Semantic search, but for words rather than images
  3. A recommendation system

I love Qdrant and want to use it in production. For our business case, I must establish some baseline for these projects before starting work on them.
I don’t have enough experience with cloud computing, so can someone give me guidance on how to manage this discussion? I mainly want to use Qdrant, but their cloud price is really high for my region and for the manager I work with!

  1. How big is your dataset?
  2. And how are you creating the embeddings, i.e. converting your images and docs to vectors?
  3. Will you be using an existing model, or training/fine-tuning one for creating embeddings?
  4. Then you layer in things like redundancy, replication, performance, and security, which should be priced into the cloud offering to decide whether it’s “worth” it.
  1. The dataset size is not known exactly, but for image semantic search assume about 1M images; for text semantic search I will just host an mBERT model.
  2. For the embeddings, I will use CLIP for image semantic search, and for text I will use the following model: medmediani/Arabic-KW-Mdel · Hugging Face.
  3. I will use existing models from Hugging Face.
    I want to store the embeddings in a vector database like Qdrant (a rough sketch of the embedding step is below).
    That’s the info I know for now.
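Roughly, the embedding step I have in mind looks like the sketch below, assuming a sentence-transformers CLIP checkpoint; the model name, image path, and query string are placeholders, not a final choice:

```python
from PIL import Image
from sentence_transformers import SentenceTransformer

# CLIP ViT-L/14 produces 768-dimensional embeddings
# (the smaller ViT-B/32 checkpoint would give 512 dimensions).
model = SentenceTransformer("clip-ViT-L-14")

# CLIP embeds images and text into the same space,
# so one model can serve both image and text queries.
image_vector = model.encode(Image.open("car_001.jpg"))   # placeholder image path
text_vector = model.encode("front bumper, 2018 sedan")   # placeholder query

print(image_vector.shape)  # (768,)
```

The Arabic keyword model could be loaded the same way if it is published as a sentence-transformers checkpoint; otherwise it would go through the plain transformers API with pooling applied on top.
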
Let’s say you’ve 1M images and 1M text pieces.

Let’s assume we use something like CLIP embeddings for this → 768 dimensions.

For each 1M set, to have the lowest latency possible, i.e. keep everything in RAM, you’d want to index about 2.86 GB of vectors. So, leaving some room for the index, we’d need about ~5 GB of RAM.
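For reference, the 2.86 GB figure is just the raw float32 storage for 1M × 768-dim vectors; here’s a quick back-of-the-envelope check (the extra headroom for the index is an estimate, not a measured number):

```python
vectors = 1_000_000
dims = 768
bytes_per_float = 4  # float32

raw_gib = vectors * dims * bytes_per_float / 1024**3
print(f"raw vectors: {raw_gib:.2f} GiB")  # ~2.86 GiB

# HNSW graph links and general overhead come on top of the raw vectors,
# which is why ~5 GB of RAM is a comfortable target for 1M vectors.
```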

Assume you’re using an 8 GB RAM machine on AWS us-east-1; this would be $0.0504 hourly on the most expensive end of things.

With a better configuration, e.g. storing some of the payload on disk, you can use a 4 GB machine instead – at $0.0385 hourly.
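For context, this is roughly what the on-disk payload option looks like with the Python qdrant-client; the URL and collection name are placeholders:

```python
from qdrant_client import QdrantClient, models

# Placeholder URL for a self-hosted Qdrant instance.
client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="car_images",  # placeholder collection name
    vectors_config=models.VectorParams(
        size=768,                         # CLIP ViT-L/14 dimensionality
        distance=models.Distance.COSINE,
    ),
    on_disk_payload=True,  # keep payloads (metadata) on disk instead of RAM
)
```

If RAM is the main constraint, the original vectors can also be memory-mapped to disk via `models.VectorParams(..., on_disk=True)`, trading a bit of latency for a smaller footprint.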

This is still on the more expensive end of things. If you can share a bit more about how many RPS (requests per second) you’d need, we can have a more accurate pricing discussion with you.
I think I know the cost of storing the embeddings, but I wanted to hear your answer. As for hosting the models on a server, I don’t know which instance type or which GPU I should use, etc. Any information or guidance will help me a lot!
