I use a fine-tuned Llama 3 8B model to generate answers from a question and context. The problem is that the model repeats the same answer multiple times in its output. Is there a way to solve this, so that the model prints the correct answer only once? I have already tried changing the `max_length` parameter, without success.
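Not an authoritative answer, but repetition like this is often tamed on the generation side with the `repetition_penalty` and `no_repeat_ngram_size` arguments to `model.generate` (both real `transformers` generation parameters), and by making sure `eos_token_id`/`pad_token_id` are set so generation can actually stop. If the fine-tuned model still duplicates the whole answer, a post-processing pass can keep only the first occurrence. The helper below (`keep_first_answer`, a hypothetical name for illustration) is a minimal sketch of that fallback:

```python
import re

# Model-side fixes to try first in transformers `model.generate`:
#   eos_token_id=tokenizer.eos_token_id,
#   pad_token_id=tokenizer.eos_token_id,
#   repetition_penalty=1.2,
#   no_repeat_ngram_size=3,
# The function below is a text-side fallback, not part of transformers.

def keep_first_answer(text: str) -> str:
    """Truncate `text` where it starts repeating itself verbatim.

    Splits on sentence boundaries and stops at the first sentence
    that has already been emitted.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    kept = []
    for s in sentences:
        key = s.strip().lower()
        if key and key in seen:
            break  # the model has started repeating itself
        seen.add(key)
        kept.append(s)
    return " ".join(kept)

print(keep_first_answer(
    "Paris is the capital. Paris is the capital. Paris is the capital."
))
# → Paris is the capital.
```

Post-processing only hides the symptom; tuning the generation parameters (and verifying the fine-tuning data ends answers with the EOS token) addresses the cause.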