Behaviour of RAG on greetings

Hi, I am building a RAG app for my codebase. I want my RAG to differentiate between queries that need context and those that don’t.
Basic prompts like “Hi” or “Who are you?” can be answered without context from the codebase. However, because retrieved context is always injected into the prompt, the RAG is not handling these queries correctly.

Currently, I am trying to differentiate between these queries using Llama 3.2, but it is producing false positives (classifying queries that do need context as basic queries). Can someone suggest a better approach for this?

def generate_clarification_question(query, retrieved_docs, previous_conversation_context=""):
    """
    Uses Llama 3.2 to determine if additional context is required.
    If the query is basic (e.g., greetings or simple identity questions), it returns "BASIC_QUERY_DETECTED".
    Otherwise, it generates a refined query for more specific backend retrieval.
    """
    prompt = f"""
    User Query:
    "{query}"
    
    Instructions:
    1. If the user's query is a simple greeting (e.g., "Hi", "Hello") or a basic identity inquiry (e.g., "Who are you?", "Who am I?"), respond with exactly: "BASIC_QUERY_DETECTED".
    2. Any other query requires further clarification.
    3. Generate a refined query that can be used to fetch better results from the database index.
    4. The clarification question should be based on the current context and user query, incorporating technical terms and relevant repository names.
    
    Provide only the necessary output.
    """
    clarification = query_ollama(prompt, light_model)
    return clarification.strip()

Hi dksensei,
Small models like llama3.2 tend to struggle with prompts that contain multiple decisions. What I find works better is to break the task down into multiple steps with just one decision each - ideally just a TRUE or FALSE, or a “YES” or “NO”, plus a short reason to help with debugging. I’d also keep the temperature low for this step. Here is how I have accomplished it for one of my chatbots, for deciding whether a web search is needed:
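A minimal sketch of that single-decision router, assuming the `query_ollama` helper and `light_model` from your snippet above (the function names `build_router_prompt` and `parse_router_reply` are just illustrative, not from any library):

```python
def build_router_prompt(query: str) -> str:
    # One decision only: does this query need codebase context?
    # Ask for YES/NO on the first line plus a one-sentence reason for debugging.
    return (
        "Answer with exactly one word on the first line: YES or NO.\n"
        "On the second line, give a one-sentence reason.\n\n"
        "Question: does the following user message require information "
        "from the code repository to answer?\n"
        f'User message: "{query}"'
    )


def parse_router_reply(raw_reply: str) -> bool:
    """Return True if retrieval should run.

    Defaults to True on empty or unexpected output, since retrieving
    unnecessary context is safer than answering a code question blind.
    """
    stripped = raw_reply.strip()
    if not stripped:
        return True
    first_token = stripped.splitlines()[0].strip().upper().rstrip(".,!")
    return first_token != "NO"
```

Wire it in front of your existing pipeline, keeping the temperature near zero so the small model stays deterministic:

```python
# reply = query_ollama(build_router_prompt(query), light_model)  # temperature ~0
# if parse_router_reply(reply):
#     docs = retrieve_from_index(query)  # your existing retrieval step
# else:
#     answer_without_context(query)      # plain chat, no injected docs
```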

Here’s an open source repo to a chatbot I developed. Check out the CognitiveProcessing.py file for ideas on similar prompts if you like:

Note: It still may not give a perfect answer every time, as it is a small model. So if accuracy is important for this step, I would suggest trying a larger model like Mistral Nemo.
