I want to know the relationship among LLM, Prompt, RAG, Prompt Engineering, and Metadata.
My boss asked me to find out about this topic for our new projects.
I know roughly about each one, but I’m not sure about the relationship between them.
My boss asked me to find out how LLM Prompt/Prompt Engineering work with RAG & Metadata, and also instructed me to create a schematic for the topic.
Please help me and explain the details.
Thank you.
Retrieval-Augmented Generation (RAG): a process in which an LLM’s generations are augmented with knowledge from a specific knowledge base
- The idea is that instead of just using the knowledge that the LLM has gathered during training, you want the LLM to answer with knowledge that you specifically provide it with
- The hope is that this will enable more accurate, attributable, and often more up-to-date responses
- The benefit is that this is much more cost-efficient than retraining or fine-tuning your LLM
- The general process (sketched in code after this list) consists of:
- retrieval from a knowledge base (vector database, etc.)
- augmentation (typically you augment the prompt fed to the LLM so that it encapsulates the knowledge you have retrieved)
- generation (the LLM generates with this augmented prompt, which provides the model with additional context)
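To make those three steps concrete, here is a minimal sketch in Python. Everything in it is a placeholder: `embed` stands in for a real embedding model, `llm_generate` for an actual LLM API call, and the list of strings for a real vector database; only the overall retrieve → augment → generate flow is the point.

```python
# Minimal RAG sketch (illustrative only; all components are stand-ins).
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)

def retrieve(question: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    # 1) Retrieval: rank chunks by similarity to the question and keep the best few.
    q = embed(question)
    ranked = sorted(knowledge_base, key=lambda chunk: float(np.dot(q, embed(chunk))), reverse=True)
    return ranked[:top_k]

def build_augmented_prompt(question: str, chunks: list[str]) -> str:
    # 2) Augmentation: fold the retrieved knowledge into the prompt itself.
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def llm_generate(prompt: str) -> str:
    # 3) Generation: placeholder for the actual LLM call.
    return f"(model output for a prompt of {len(prompt)} characters)"

knowledge_base = [
    "Our warranty covers manufacturing defects for 24 months.",
    "Returns are accepted within 30 days with a receipt.",
    "The office is closed on public holidays.",
]
question = "How long is the warranty?"
print(llm_generate(build_augmented_prompt(question, retrieve(question, knowledge_base))))
```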
Prompt Engineering: in very broad terms, the study of how you should construct your inputs to guide the LLM to output in a certain way
An overview, from my understanding, would be something like this:
Not entirely sure what exactly your boss meant by “metadata”, but he may have been speaking within the context of a typical vector database, which is a type of knowledge base designed specifically for storing and organizing unstructured data. In that context, metadata usually means the structured attributes (source, date, author, tags, and so on) stored alongside each document or embedding, which can be used to filter or narrow down retrieval.
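As a simplified illustration of how such metadata can be used, imagine each stored chunk carrying a few structured fields alongside its text. The field names (`source`, `year`) and the keyword-based ranking below are purely illustrative; a real vector database exposes comparable metadata filters through its own query API and ranks by embedding similarity instead.

```python
# Illustrative only: metadata-filtered retrieval over a tiny in-memory collection.
documents = [
    {"text": "Q3 revenue grew 12% year over year.", "source": "earnings_report", "year": 2024},
    {"text": "Q3 revenue grew 8% year over year.",  "source": "earnings_report", "year": 2022},
    {"text": "We plan to open two new offices.",    "source": "blog_post",       "year": 2024},
]

def filtered_search(query: str, docs: list[dict], **filters) -> list[dict]:
    # Narrow the candidates with metadata first, then rank whatever remains.
    candidates = [d for d in docs if all(d.get(k) == v for k, v in filters.items())]
    # Stand-in ranking; a real system would rank by embedding similarity to `query`.
    return sorted(candidates, key=lambda d: d["text"].lower().count(query.lower()), reverse=True)

hits = filtered_search("revenue", documents, source="earnings_report", year=2024)
print(hits[0]["text"])  # -> "Q3 revenue grew 12% year over year."
```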
LLMs use prompts to generate responses, and prompt engineering is the practice of optimizing those inputs. RAG integrates external information retrieved based on the prompt, while metadata provides context that can improve retrieval accuracy.
But why do we need prompts exactly? Can’t a special token achieve the same thing? For example, in Q&A fine-tuning, if we add tags [Q] and [A], won’t the model automatically understand to answer any given question with the [Q] tag?
Why do we need to add a prompt like “you are a question answering bot…”? Does the prompt actually help the model perform the task better?
Simple example: the LLM is your brain, the prompt is your thought process that gives output in accordance with your thoughts, RAG is when your brain reads a book or other external knowledge to answer a question, and metadata is simply some information about the data.
> Can’t a special token achieve the same thing? For example, in Q&A fine-tuning, if we add tags [Q] and [A], won’t the model automatically understand to answer any given question with the [Q] tag?
Large language models are trained on vast amounts of textual data in an unsupervised manner. That is, LLMs “learn” by essentially extracting linguistic characteristics or patterns from data that has not been specifically labelled. And they do this very well.
Assuming that no additional fine-tuning has been done, an LLM is more likely to have seen data in the form of “Question:” / “Answer:” than it is to have seen “[Q]” / “[A]”. And the hope is that with inputs in forms it has seen more often, it will more reliably produce an output that behaves as we might expect.
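As a trivial illustration, here is the same question phrased both ways (the wording itself is just an example):

```python
# The first framing matches patterns a base model has likely seen very often during
# pretraining; the second only works reliably if the model was trained on those tags.
question = "What is the capital of France?"

common_format = f"Question: {question}\nAnswer:"
tag_format = f"[Q] {question}\n[A]"
```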
In terms of adding specific tags like [Q] or [A], one approach might be to fine-tune an LLM to utilize and make sense of those tags. But this is a bit different from having the model “automatically understand to answer any given question with the [Q] tag”. It is better to think of fine-tuning as a way to “guide” the model to generate in a specific way. After fine-tuning, a model is more likely to generate [A] given input [Q], but this is not a guarantee. The model just learns that this is most “likely” a suitable generation given the input.
So of the following two statements, the first may be true after fine-tuning, but the second is not:
- Given [Q], a model is more likely to generate [A].
- Given [Q], a model will always generate [A].
All this is to say that there is no real way to completely “fix” model outputs; you can only make the outputs you want more likely.
> Why do we need to add a prompt like “you are a question answering bot…”? Does the prompt actually help the model perform the task better?
Prompt engineering is, broadly, the study of how one might engineer inputs to induce specific outputs. That is, if I want to see some result Y, what is the best way to construct the input X to do so?
The catch is that there is no real “surefire” prompt that always gets all LLMs to behave in a particular way. Imagine that you have five different kids trying to learn a particular subject. Some of them might do well in one subfield, while others do well in another. One of the kids might be a visual learner. In the same way, different LLMs behave differently, even given the same structured input.
You may have, however, heard that specific input formats seem to yield better results in general. For example, a commonly used choice is a Markdown-style instruction format like the one used in fine-tuning the Stanford Alpaca model. Such formats are often used because they serve as a pretty good starting point for experimentation. Think of them as “rules of thumb” that work pretty well for the typical case.
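For reference, the (no-input) Alpaca instruction template looks roughly like this, with the `{instruction}` placeholder filled in per example:

```python
# Alpaca-style instruction template; the "###" headers are the Markdown-like structure mentioned above.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Summarize our return policy in one sentence.")
```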
Returning to your question, the exact phrase “you are a question answering bot…” may not seem very important. But one finding in prompt engineering is that assigning roles to an LLM can yield generations with certain characteristics (for example, they may be more descriptive). A brief explanation of “Assigning Roles” can be seen here.
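A role-assigning prompt can be as simple as prepending a persona before the actual task; with chat-style APIs this usually goes in the system message. The sketch below assumes a generic role/content message format and a hypothetical client call:

```python
# Sketch only: chat-style APIs generally accept a list of role/content messages,
# but exact field names and the client call differ by provider.
messages = [
    {
        "role": "system",
        "content": (
            "You are a question answering bot for our internal HR policies. "
            "Answer concisely and cite the policy section you used."
        ),
    },
    {"role": "user", "content": "How many vacation days do new employees get?"},
]
# response = client.chat(messages=messages)  # hypothetical client call
```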
> But why do we need prompts exactly?
Prompts are essentially a way of guiding your LLM to generate in a specific manner. The hope is that the different components (sentences) will induce certain behaviors that ultimately synergize and empower your LLM to yield results that you are satisfied with. What works best, in your particular context, can really only be determined with lots of experimentation.
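As a final, purely illustrative sketch of components working together (the company, policy text, and wording are made up):

```python
# A prompt assembled from components, each meant to nudge the output in a different way.
role        = "You are a support assistant for Acme Corp."            # persona / tone
constraints = "Answer in at most two sentences. If unsure, say so."   # output shape
context     = "Policy excerpt: returns are accepted within 30 days."  # retrieved knowledge (RAG)
question    = "Customer asks: can I return an item after six weeks?"

prompt = "\n\n".join([role, constraints, context, question])
# Which components help, and how to phrase them, is exactly what you iterate on.
print(prompt)
```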