I'm using a fine-tuned Llama 3 8B model to generate answers from a question and its context. The problem is that the model repeats the answer multiple times in its output. Is there a way to solve this, so that the model prints the correct answer only once? I also tried changing the `max_length` parameter, without success.
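For context, here is a minimal sketch of the kind of generation settings I've been experimenting with (the parameter values and the eos token id are illustrative, not my exact setup; model loading is omitted):

```python
# Sketch of generation settings for controlling repetition.
# Values below are examples, not a known-good configuration.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    max_new_tokens=128,      # caps only the generated tokens, unlike max_length,
                             # which counts prompt tokens too
    repetition_penalty=1.2,  # penalize tokens that have already appeared
    no_repeat_ngram_size=3,  # forbid repeating any 3-gram verbatim
    eos_token_id=128009,     # assumed id of Llama 3's <|eot_id|> token,
                             # so generation can actually stop
)

# Then passed to generate, roughly like:
# output_ids = model.generate(**inputs, generation_config=gen_config)
```

Even with a config along these lines, the repeated answers persist, so I'm wondering whether the issue is in the generation parameters at all or somewhere else (e.g. the prompt template or the fine-tuning data).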