Domain adaptation fine-tuning vs. instruction tuning

TL;DR: I want to fine-tune ‘meta-llama/Llama-2-7b-chat-hf’ with my own instructional data, but it turns out that the new model is worse than the original one.

How should I differentiate in the code between different types of fine-tuning? I saw different types of fine-tuning on the internet:

Domain adaptation
Instruction-based fine-tuning (it looks like this is what is needed)
Supervised fine-tuning …
But how should they differ in the code? For example, assuming that I use AutoModelForCausalLM and a Trainer object, simple code would look like this:


import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset_train,
    eval_dataset=dataset_test,
    data_collator=transformers.DataCollatorForLanguageModeling(
        tokenizer, mlm=False
    ),
)
trainer.train()

But what should dataset_train look like?

Hi, have you found the answer to your question?
I’m interested in your experience.

Hi cfrancois7,

I have discovered the answer to your question.

Essentially, what trainer.train() does is attempt to predict the next token. So whatever train_dataset or eval_dataset you give the trainer, whether each element holds tokens or a string, the objective is always to predict the subsequent token.
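
For illustration, here is a minimal sketch of what a causal-LM batch looks like (the example sentence is arbitrary): with mlm=False, the collator simply copies input_ids into labels, and the model shifts them internally so that each position predicts the token that follows it.

from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 has no pad token by default

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
batch = collator([tokenizer("The capital of France is Paris.")])

# labels is a copy of input_ids (padding positions are set to -100);
# the model shifts it by one position to compute the next-token loss
print(batch["input_ids"][0])
print(batch["labels"][0])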

From this perspective, there is no distinction between Domain Adaptation and Instruction-based Fine-tuning – the difference lies solely in the data.

In Domain Adaptation, the data usually takes the form of a single, very long text obtained by concatenating your corpus. The objective remains next-token prediction; the text is simply fed to the model in windows of a fixed context length (e.g., 2K or 4K tokens). This stage often accounts for the bulk of the learning because of the sheer size of the data.
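
As a sketch of what that data preparation typically looks like (the corpus file name and block size here are hypothetical), you tokenize the corpus, concatenate everything, and slice it into fixed-length blocks:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
block_size = 2048  # whatever context length you train with

raw = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"])

def group_texts(batch):
    # Concatenate all token ids, then slice into fixed-length blocks
    concatenated = sum(batch["input_ids"], [])
    total = (len(concatenated) // block_size) * block_size
    return {"input_ids": [concatenated[i:i + block_size] for i in range(0, total, block_size)]}

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
dataset_train = tokenized.map(group_texts, batched=True, remove_columns=tokenized.column_names)
# DataCollatorForLanguageModeling(tokenizer, mlm=False) copies input_ids into labels at batch time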

When it comes to Instruction Tuning, it usually involves a few thousand pairs of prompts and completions aimed at teaching the model how to respond to instructions. However, generally, there is no difference in the learning method – the model always attempts to predict the next token, whether it is part of the prompt or the completion.
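
For instruction tuning, only the way each training example is built changes. A sketch, assuming your data is a list of prompt/completion pairs (the field names and example below are hypothetical):

from datasets import Dataset

pairs = [
    {"prompt": "Summarize: The cat sat on the mat.", "completion": "A cat sat on a mat."},
    # ... a few thousand more
]

def to_text(example):
    # Llama-2-chat was trained with the [INST] ... [/INST] prompt format;
    # the tokenizer adds the BOS token itself, and "</s>" encodes to EOS so
    # the model learns where a completion ends
    return {"text": f"[INST] {example['prompt']} [/INST] {example['completion']} </s>"}

dataset_train = Dataset.from_list(pairs).map(to_text)
# Tokenize the "text" column and train exactly as above: the loss is still
# plain next-token prediction over prompt + completion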

Please note that you can optionally configure training so that the loss is computed only on the completion part by using SFTTrainer from TRL instead of the regular Trainer. Essentially, SFTTrainer is a subclass of Trainer that adds this convenience.
For more information, see Train on Completions Only.
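
A minimal sketch of that setup (TRL's API has changed between releases, so treat the exact keyword arguments as assumptions and check the docs of the version you have installed):

from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

# Mask out the prompt so the loss is computed on the completion only;
# "[/INST]" marks where the completion starts in the Llama-2 chat format
collator = DataCollatorForCompletionOnlyLM("[/INST]", tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    args=training_args,          # recent TRL versions expect an SFTConfig here
    train_dataset=dataset_train,
    dataset_text_field="text",   # may belong on SFTConfig, depending on the TRL version
    data_collator=collator,
)
trainer.train()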
