Fine-Tuning a Document-Based Chatbot – Issues and Questions
Hello everyone!
I am working on fine-tuning a chatbot that generates answers based on documents (RAG + Fine-tuning).
During the tuning process, I encountered several issues, and I would appreciate any insights or solutions from those with experience in this area.
Question 1: How should the dataset be structured for training a document-based chatbot?
When training a model to generate document-based answers, should I use a question-answer dataset, or should I build a question-document-answer dataset?
I’d love to know the common approach!
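To make the two options concrete, here is a minimal sketch of what a single record might look like in each layout. The field names (`question`, `document`, `answer`), the sample text, and the helper `missing_fields` are placeholders made up for illustration, not an established schema.

```python
# Hypothetical record layouts; field names and sample text are illustrative only.

# Option A: question-answer. The model never sees a document during training.
qa_record = {
    "question": "How long is the warranty?",
    "answer": "The warranty lasts two years from the purchase date.",
}

# Option B: question-document-answer. The retrieved passage is part of the example.
qda_record = {
    "question": "How long is the warranty?",
    "document": "Warranty terms: parts and labor are covered for two years from purchase.",
    "answer": "The warranty lasts two years from the purchase date.",
}

# Fields a document-grounded example would need under this assumed schema.
REQUIRED_FIELDS = {"question", "document", "answer"}


def missing_fields(record: dict) -> list:
    """Return the required fields that a record lacks, sorted by name."""
    return sorted(REQUIRED_FIELDS - record.keys())


print(missing_fields(qa_record))
print(missing_fields(qda_record))
```

The only structural difference is the `document` field: with Option A the document appears for the first time at inference, while with Option B it is present in every training example.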
Question 2: Issues encountered after experimenting with two training methods
Training with a Question + Answer dataset
The responses sounded natural, but the model hallucinated (it generated incorrect information).
The model generated answers even when the provided document contained no relevant information.
To prevent this, I added the following instructions to the inference-time prompt:
- “If the document does not contain relevant information for the question, respond with: ‘Sorry, I couldn’t find any relevant information.’”
- “End the response with: ‘Thank you :):)’”
However, the model did not follow these instructions.
Here is the prompt I used:
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
I included multiple instructions in the Instruction section, but the model did not adhere to them.
Additionally, I did not include documents during training but added them only at inference time.
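The difference between the training and inference prompts in this setup can be sketched as follows. This is an assumption about how the template slots were filled: the post does not say which slot held the question during training, so the helpers `build_training_prompt` and `build_inference_prompt` and their slot assignments are hypothetical.

```python
# Same template text as in the post, repeated here so the sketch is self-contained.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

# The two extra rules that were added only at inference time.
RULES = (
    "If the document does not contain relevant information for the question, "
    "respond with: 'Sorry, I couldn't find any relevant information.' "
    "End the response with: 'Thank you :):)'"
)


def build_training_prompt(question: str, answer: str) -> str:
    # Assumed training layout: question in Instruction, empty Input, answer in Response.
    return alpaca_prompt.format(question, "", answer)


def build_inference_prompt(question: str, document: str) -> str:
    # Assumed inference layout: rules plus question in Instruction, document in Input,
    # and an empty Response slot for the model to complete.
    instruction = f"{RULES}\n\nQuestion: {question}"
    return alpaca_prompt.format(instruction, document, "")


print(build_training_prompt("How long is the warranty?", "Two years."))
print("-" * 40)
print(build_inference_prompt("How long is the warranty?", "Warranty terms: two years."))
```

Under these assumptions, the fine-tuned model has never seen a filled Input slot, a refusal response, or the closing phrase, so the inference prompt differs from everything it was trained on.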
Training with a Question + Document + Answer dataset
When training with documents, the generated answers were strange and inconsistent.
In some cases, the model directly copied parts of the document instead of generating a proper response.
The documents I used for training were quite long. Could this be the reason?
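One way to put a number on the copying behaviour described above is to measure the longest run of words an answer shares verbatim with its document. The sketch below uses only the standard library `difflib`; the function names and the sample strings are made up for illustration, and word-level matching on whitespace is a simplifying assumption.

```python
from difflib import SequenceMatcher


def longest_copied_span(document: str, answer: str) -> str:
    """Return the longest run of consecutive words that appears in both texts."""
    doc_words = document.split()
    ans_words = answer.split()
    matcher = SequenceMatcher(None, doc_words, ans_words, autojunk=False)
    match = matcher.find_longest_match(0, len(doc_words), 0, len(ans_words))
    return " ".join(ans_words[match.b : match.b + match.size])


def copy_ratio(document: str, answer: str) -> float:
    """Fraction of the answer's words covered by its longest verbatim span."""
    ans_words = answer.split()
    if not ans_words:
        return 0.0
    return len(longest_copied_span(document, answer).split()) / len(ans_words)


# Illustrative strings only, not real model output.
document = "The warranty covers parts and labor for two years from purchase"
copied = "It covers parts and labor for two years"
paraphrased = "Two-year coverage"

print(longest_copied_span(document, copied))
print(copy_ratio(document, copied))
print(copy_ratio(document, paraphrased))
```

Running this over generated answers (and over the reference answers in the training set) would show whether the copying is learned from the data or only appears at inference with long documents.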
What is the standard approach for training a document-based chatbot?
For tasks involving document-based answer generation, how is training typically conducted?
Is there a better approach than what I have tried?
I would really appreciate any insights or advice!
#fine-tuning #llama #rag #instruction-tuning #hallucination #dataset-preparation #inference #prompt-engineering #large-language-models #document-based-chatbot