RetrievalQA output repeats prompt and context sources

I have a Python script that uses LangChain, Hugging Face, and Llama 3 as a RAG pipeline for answering questions about our private data, falling back to the LLM's own knowledge when the retrieved documents don't help.
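
For context, llm and index in the snippet further down are built with a Hugging Face pipeline and a local FAISS vector store, roughly along these lines. This is a simplified sketch rather than my exact code: the model id, embedding model, and file path are placeholders, and the imports assume a LangChain version that has the langchain_community package.

from transformers import pipeline
from langchain_community.llms import HuggingFacePipeline
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Wrap a local Llama 3 text-generation pipeline so LangChain can call it
generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    max_new_tokens=256,
)
llm = HuggingFacePipeline(pipeline=generator)

# Load and chunk the private documents, then build the vector index
docs = TextLoader("data/private_docs.txt").load()  # placeholder path
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # placeholder embedding model
)
index = FAISS.from_documents(chunks, embeddings)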

The output includes the prompt and also lists all the context documents used for the answer. Is there a way to stop it from repeating the prompt and to hide the context sources?

Here is the relevant part of the code:

from langchain.chains import RetrievalQA


def query_docs(query):
    # llm and the vector index are created earlier in the script (omitted here)
    # Initialize the RetrievalQA chain
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff", retriever=index.as_retriever()
    )

    # Run the query and print the entire result dict
    result = qa_chain(query)
    print(result)


def main():
    # Keep prompting for questions until the user types "exit"
    query = input("Type in your question: \n")
    while query != "exit":
        query_docs(query)
        query = input("Type in your question: \n")


if __name__ == "__main__":
    main()
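
To make the goal concrete: as far as I can tell, qa_chain(query) returns a dict keyed by "query" and "result", so what I'm hoping for is a supported way to print only the answer text, roughly like this sketch (intent only, assuming those are the right keys), with no prompt echo and no retrieved sources in it:

result = qa_chain(query)       # dict with "query" and "result" keys
print(result["result"])        # ideally just the generated answer, nothing else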