Llama-2 7B-hf repeats context of question directly from input prompt, cuts off with newlines

I am not sure I understand your question correctly. If you want the answer from Llama 2 to not include the prompt you provide, you can use `return_full_text=False`:

```python
sequences = pipeline(
    myPrompt,
    do_sample=True,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=4096,         # max length of the output, default=4096
    return_full_text=False,  # set to False so the prompt is not repeated
    top_k=10,                # default=10
    # top_p=0.5,             # default=0.9
    temperature=0.6,
)
```
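Conceptually, `return_full_text=False` just strips the input prompt from the decoded output before it is returned. A minimal sketch of that behaviour with a stubbed-out generator (the names `fake_generate` and `generate` are hypothetical, for illustration only, not part of the transformers API):

```python
def fake_generate(prompt: str) -> str:
    """Stand-in for the model: decoding yields the prompt followed by the completion."""
    return prompt + " Paris is the capital of France."

def generate(prompt: str, return_full_text: bool = True) -> str:
    """Mimic the pipeline flag: optionally strip the prompt from the decoded text."""
    decoded = fake_generate(prompt)
    if return_full_text:
        return decoded
    return decoded[len(prompt):].lstrip()

print(generate("What is the capital of France?", return_full_text=False))
# → Paris is the capital of France.
```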