I fine-tuned llama2-ko-7b with LoRA for question answering based on a given context.
My training data was a JSONL file containing multiple text records:
- Example: { "text": "### Instruction:\n{question}\n\n### Input:\n{context}\n\n### Response:\n{answer}." }
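To make the record format concrete, here is a minimal Python sketch of how one such JSONL line and the matching inference prompt could be built. The helper names `build_record` and `build_prompt` and the sample values are hypothetical and only for illustration; the template string is the one from the example record. Note that the string as written ends with the answer and a period, with no explicit end-of-sequence token in the text itself; whether one was appended during tokenization is not stated here.

```python
import json

# Template taken from the example record; the trailing "." follows the answer.
TEMPLATE = (
    "### Instruction:\n{question}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n{answer}."
)

def build_record(question: str, context: str, answer: str) -> str:
    """Return one JSONL line with a single "text" field (hypothetical helper)."""
    text = TEMPLATE.format(question=question, context=context, answer=answer)
    # ensure_ascii=False keeps Korean text human-readable in the file.
    return json.dumps({"text": text}, ensure_ascii=False)

def build_prompt(question: str, context: str) -> str:
    """Return the inference prompt: the template up to and including "### Response:\\n"."""
    return f"### Instruction:\n{question}\n\n### Input:\n{context}\n\n### Response:\n"

if __name__ == "__main__":
    # Placeholder values, not real training data.
    print(build_record("sample question", "sample context", "sample answer"))
```

The prompt sent at inference time should match the training template exactly up to the response header, since any difference in the section markers changes what the model sees.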
The model was trained for 20 epochs, and I am now trying to run inference on Triton server.
I am having a problem with the output text.
After the first sentence, the output always continues with \n\n\n, ###, or Input.
I tried the following request parameters:
"max_tokens": 30,
"bad_words": ["\n\n###", "###"],
"stop_words": ["\n\n###", ".", "!"],
"pad_id": 2,
"end_id": 2,
"streaming": 1,
"early_stopping": true,
"temperature": 1.0,
"top_k": 50,
"top_p": 0.92,
"no_repeat_ngram_size": 3,
"eos_token_id": 2,
"num_beams": 1,
"do_sample": true
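Example: "text_output": "경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
(The Korean sentence means: "The landscape plan must be established before the detailed design is completed.")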
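For illustration, the unwanted tail in this example could be trimmed on the client side by cutting the text at the first stop marker. This is only a sketch: the function name `truncate_at_markers` and the marker list are hypothetical, chosen from the symptoms described above, and the approach hides the trailing tokens rather than stopping the model from generating them.

```python
# Markers assumed from the symptoms above: blank lines and the "###" section header.
STOP_MARKERS = ["\n\n", "###"]

def truncate_at_markers(text: str, markers=STOP_MARKERS) -> str:
    """Cut text at the earliest occurrence of any marker, then strip trailing whitespace."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

if __name__ == "__main__":
    raw = "경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
    print(truncate_at_markers(raw))
```

Because the model still generates the extra tokens up to max_tokens, this does not reduce latency; it only cleans the returned string.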
Q: How can I prevent this issue during inference?