Fine-Tuning + RAG-Based Chatbot: Dataset Structure & Instruction Adherence Issues

:rocket: Fine-Tuning a Document-Based Chatbot – Issues and Questions

Hello everyone!
I am working on fine-tuning a chatbot that generates answers based on documents (RAG + fine-tuning).
During the tuning process, I encountered several issues, and I would appreciate any insights or solutions from those with experience in this area.


:rocket: Question 1: How should the dataset be structured for training a document-based chatbot?
When training a model to generate document-based answers:
:white_check_mark: Should I use a question-answer dataset?
:white_check_mark: Or should I build a question-document-answer dataset?

I’d love to know the common approach!
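For illustration, here is a minimal sketch of what a record in each format might look like. The field names and contents are placeholders I made up, not an established schema:

```python
import json

# Format 1: question-answer only (no document in the training data).
qa_record = {
    "question": "What is the warranty period for product X?",
    "answer": "The warranty period is two years.",
}

# Format 2: question-document-answer (the retrieved document is part of
# the training example, so the model can learn to ground its answer in it).
qda_record = {
    "question": "What is the warranty period for product X?",
    "document": "Product X ships with a two-year limited warranty covering ...",
    "answer": "According to the document, the warranty period is two years.",
}

# One JSON object per line (JSONL), a common layout for fine-tuning data.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in (qa_record, qda_record):
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```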


:rocket: Question 2: Issues encountered after experimenting with two training methods

:one: Training with a Question + Answer dataset
:heavy_check_mark: The responses were natural, but hallucinations (fabricated information) occurred.
:heavy_check_mark: The model generated answers even when the provided document contained no relevant information.
:heavy_check_mark: To prevent this, I added the following instructions to the inference-time prompt:

  • “If the document does not contain relevant information for the question, respond with: ‘Sorry, I couldn’t find any relevant information.’”
  • “End the response with: ‘Thank you :):)’”
:heavy_check_mark: However, the model did not follow these instructions.

:pushpin: Here is the prompt I used:

```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""
```

:heavy_check_mark: I included multiple instructions in the Instruction section, but the model did not adhere to them.
:heavy_check_mark: Additionally, I did not include any documents during training; they were only added to the prompt at inference time (sketched below).
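
To make the setup concrete, here is a simplified sketch of how I assemble the prompt at inference time with the template above (the retriever call itself is omitted; `build_inference_prompt` is just my own helper):

```python
# The two instructions I listed above, placed in the Instruction section.
instruction = (
    "If the document does not contain relevant information for the question, "
    "respond with: 'Sorry, I couldn't find any relevant information.' "
    "End the response with: 'Thank you :):)'"
)

def build_inference_prompt(question: str, retrieved_document: str) -> str:
    # Document and question both go into the Input section; the Response
    # slot is left empty so the model generates the answer.
    user_input = f"Document:\n{retrieved_document}\n\nQuestion:\n{question}"
    return alpaca_prompt.format(instruction, user_input, "")
```

My understanding is that because the training examples never contained these instructions (or any document), the model had no opportunity to learn this refusal behavior, which might explain why it ignores them at inference time. Is that the right way to think about it?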


:two: Training with a Question + Document + Answer dataset
:heavy_check_mark: When training with documents included, the generated answers were incoherent and inconsistent.
:heavy_check_mark: In some cases, the model directly copied parts of the document instead of generating a proper response.
:heavy_check_mark: The documents I used for training were quite long. Could that be the reason?
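
One idea I'm considering is chunking the documents before building training examples, so each example fits comfortably in the context window. A rough sketch (the chunk size and overlap are arbitrary numbers, nothing I've validated):

```python
CHUNK_SIZE = 2000  # characters per chunk (arbitrary; token-based is also common)
OVERLAP = 200      # characters shared between neighbouring chunks

def chunk_document(text: str) -> list[str]:
    """Split a long document into overlapping character-based chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + CHUNK_SIZE])
        start += CHUNK_SIZE - OVERLAP
    return chunks
```

Would pairing each question only with the chunk(s) that actually contain the answer be the better practice here?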


:question: What is the standard approach for training a document-based chatbot?
For tasks involving document-based answer generation, how is training typically conducted?
Is there a better approach than what I have tried?

I would really appreciate any insights or advice! :blush:

#fine-tuning #llama #rag #instruction-tuning #hallucination #dataset-preparation #inference #prompt-engineering #large-language-models #document-based-chatbot
