Llama-2 7B-hf repeats context of question directly from input prompt, cuts off with newlines

I am not sure I understand your question correctly. If you want the answer from Llama 2 to not include the prompt you provide, you can use `return_full_text=False`:

```python
sequences = pipeline(
    myPrompt,
    do_sample=True,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=4096,         # max length of the output, default=4096
    return_full_text=False,  # set to False so the prompt is not repeated
    top_k=10,                # default=10
    # top_p=0.5,             # default=0.9
    temperature=0.6,
)
```
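Conceptually, `return_full_text=False` just strips the input prompt from the decoded output before it is returned. A minimal sketch of that behaviour with a stubbed-out generator (the names `fake_generate` and `generate` are hypothetical, for illustration only, not part of the transformers API):

```python
def fake_generate(prompt: str) -> str:
    """Stand-in for the model: decoding yields the prompt followed by the completion."""
    return prompt + " Paris is the capital of France."

def generate(prompt: str, return_full_text: bool = True) -> str:
    """Mimic the pipeline flag: optionally strip the prompt from the decoded text."""
    decoded = fake_generate(prompt)
    if return_full_text:
        return decoded
    return decoded[len(prompt):].lstrip()

print(generate("What is the capital of France?", return_full_text=False))
# → Paris is the capital of France.
```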