Use embeddings stored in vector db to reduce work for LLM generating response

scotsditch · February 19, 2024, 10:00pm

I’m trying to understand the correct strategy for storing and using embeddings in a vector database alongside an LLM. My goal is to reduce the amount of work the LLM has to do when generating a response. Think of a RAG implementation where I’ve stored the text, embeddings I’ve created using an LLM, and metadata about the text. I then want to generate responses to queries about the data using, say, an OpenAI model, and I don’t want to spend a bunch of money and time chunking the text and creating embeddings for it every time I answer a query.

Suppose I create a vector database, for example a Chroma database, use an LLM to create embeddings for my corpus, and save those embeddings into the database along with the text and metadata. Would the database use the embeddings I created to find the relevant text chunks, or would it make more sense for the vector database to use its own query process to find the relevant chunks (not using the embeddings the LLM created)?
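To make the first option concrete, here is a minimal sketch of a store that reuses the saved embeddings at query time: the corpus is embedded once on insert, and each query only embeds the query string and compares it against the stored vectors. Everything here is an assumption for illustration. `toy_embed` (sparse word counts) stands in for a real embedding model, and `TinyVectorStore` is a hypothetical in-memory class, not Chroma's API.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps embedding, text, and metadata together."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.rows = []  # list of (embedding, text, metadata)

    def add(self, text, metadata=None):
        # The chunk is embedded once, here, and the vector is kept for later queries.
        self.rows.append((self.embed_fn(text), text, metadata or {}))

    def query(self, query_text, k=2):
        # Only the query is embedded at question time, with the same embed_fn.
        q = self.embed_fn(query_text)
        scored = [(cosine(q, emb), text, meta) for emb, text, meta in self.rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


store = TinyVectorStore(toy_embed)
store.add("Chroma stores embeddings together with documents and metadata.", {"source": "doc1"})
store.add("Bananas are rich in potassium.", {"source": "doc2"})
store.add("Chunking splits long text into smaller passages.", {"source": "doc3"})

hits = store.query("Where are embeddings stored with metadata?", k=2)
for score, text, meta in hits:
    print(f"{score:.3f}  {meta['source']}  {text}")
```

The point of the sketch is that the query vector and the stored vectors must come from the same embedding function for the similarity scores to mean anything.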

Also, do I pass the embeddings from the vector database to the LLM to generate the response, or do I pass the text that the vector database found most relevant, along with the original text query, so the LLM can then generate a response?
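For reference, the second option (passing retrieved text rather than vectors) can be sketched as below. Chat-style model APIs take text as input, so the retrieved chunks are placed into the prompt next to the original question. `build_prompt` and the prompt wording are hypothetical, and the call to the model itself is left out.

```python
def build_prompt(question, retrieved_chunks):
    """Assemble a text prompt from the user's question and the retrieved text chunks."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Example chunks, as they might come back from a vector database query.
chunks = [
    "Chroma stores embeddings together with documents and metadata.",
    "Chunking splits long text into smaller passages.",
]
prompt = build_prompt("Where are embeddings stored?", chunks)
print(prompt)
```

The resulting string is what would be sent to the model; the embeddings stay in the database and are only used to pick the chunks.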
