I have a question about RAG. When I query a vector database as part of an LLM pipeline, how exhaustive is the search?
For example, if I’m searching a corpus of legal documents for outcomes in specific case scenarios.
Is the answer I get back based on an exhaustive search of all indexed documents that contain relevant matches?
Does it just pull the top X matches based on similarity?
And do all of those matches make it back into the LLM's context to inform the answer? Or is there some kind of cut-off, and if so, what?
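To make the question concrete, here is a minimal sketch of what I understand typical retrieval to look like: every chunk gets a similarity score against the query, and only the top-k chunks above some threshold reach the LLM. (The function, the `k=4` default, and the `min_score` cutoff are my own illustrative choices, not any particular library's API; real vector stores also tend to use approximate rather than exhaustive search.)

```python
import math

def top_k_matches(query_vec, doc_vecs, k=4, min_score=0.2):
    """Rank stored chunks by cosine similarity and keep the best k.

    Hand-rolled sketch: production vector stores usually use approximate
    nearest-neighbour indexes, so the scan is often NOT exhaustive, but
    the effect is the same -- only the top-k chunks ever reach the LLM's
    context window.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    # Score every chunk, sort descending, keep k, then apply the cutoff.
    scored = sorted(
        ((i, cos(query_vec, v)) for i, v in enumerate(doc_vecs)),
        key=lambda p: p[1],
        reverse=True,
    )
    return [(i, s) for i, s in scored[:k] if s >= min_score]

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions).
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k_matches(query, docs, k=2))
```

So my question is essentially: is this the right mental model, and what happens to everything below the cutoff?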
The broader question is: how well informed will a RAG answer be, and which use cases is it good or bad for? E.g. will the answer draw on all the stored information, or just a subset of it?