In a RAG system that keeps the previous conversation as context, how does the system decide whether to search the vector embeddings again or simply refine the answer it already gave?
For example, in a RAG system that answers from uploaded context:
- On the first turn, the user asks for a list of leaves, so the system searches the vector database and returns the answer.
- On the second turn, the user asks to display the same list of leaves in a tabular format. Here the system should not perform another search; it should only reformat the previous answer.
How can the system distinguish between a request that needs a new search and one where it should reuse and refine a previous response?
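
To make the question concrete, the behavior I am after could be sketched roughly as follows. This is only an illustration under assumptions: `route`, `tokens`, `CUE_WORDS`, and `FILLER` are hypothetical names, and the keyword heuristic is a stand-in for whatever a real system would use (for example, an LLM-based classification step). It returns "refine" only when a previous answer exists, the query contains a formatting cue, and the query introduces no terms that were absent from the earlier turn.

```python
import re
from typing import Optional, Set

# Hypothetical cue words suggesting the user wants the previous answer reshaped.
CUE_WORDS = {"table", "tabular", "format", "reformat", "bullet", "bullets",
             "summarize", "shorten", "rephrase", "sort"}

# Hypothetical filler words that carry no topical content.
FILLER = {"the", "a", "an", "in", "as", "of", "to", "me", "it", "that", "this",
          "same", "please", "show", "display", "give", "put", "make", "into",
          "with", "and", "above", "previous"}


def tokens(text: str) -> Set[str]:
    """Lowercase alphabetic words of the text, minus filler words."""
    return set(re.findall(r"[a-z]+", text.lower())) - FILLER


def route(query: str, last_answer: Optional[str], seen_terms: Set[str]) -> str:
    """Return "refine" to reuse the previous answer, otherwise "search"."""
    words = tokens(query)
    has_cue = bool(words & CUE_WORDS)
    # Terms that are neither formatting cues nor already covered by earlier turns.
    new_terms = words - CUE_WORDS - seen_terms
    if last_answer is not None and has_cue and not new_terms:
        return "refine"
    return "search"


# Turn 1: no previous answer yet, so a search is needed.
first_query = "Give me the list of leaves"
first_route = route(first_query, None, set())

# Turn 2: same topic plus a formatting cue, so the earlier answer is reused.
seen = tokens(first_query)
second_route = route("Display the same list of leaves in a tabular format",
                     "placeholder for the first answer", seen)
```

In this sketch the first turn routes to "search" and the second to "refine", while a follow-up that mentions a new topic (even with a formatting cue) would fall back to "search".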