I'm confused about when to transform a dataset with a model-specific chat template. Is it always advisable to apply the model's chat template to the data before training on it? Alpaca-style datasets, for instance, are often formatted with a function like the one below. That code applies a generic prompt format rather than a model-specific chat template, yet chat and instruct models are trained on very specific templates. Wouldn't training on a generic format like this just confuse the model? And yet the SFT tutorial recommends this exact function, so I feel like I'm getting mixed messages.
def formatting_prompts_func(examples):
    output_text = []
    for i in range(len(examples["instruction"])):
        instruction = examples["instruction"][i]
        input_text = examples["input"][i]
        response = examples["output"][i]
        # Use the "with input" prompt only when the input field is non-trivial.
        if len(input_text) >= 2:
            text = f'''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{response}
'''
        else:
            text = f'''Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
{response}
'''
        output_text.append(text)
    return output_text
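
By contrast, here is roughly what I understand a model-specific template to mean, using the tokenizer's apply_chat_template. This is just a sketch of the alternative I have in mind (the model name is only an example, and I'm assuming its tokenizer ships a chat template and that folding instruction plus input into a single user turn is reasonable):

from transformers import AutoTokenizer

# Example only: any chat/instruct model whose tokenizer defines a chat template.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

def chat_template_formatting_func(examples):
    output_text = []
    for i in range(len(examples["instruction"])):
        instruction = examples["instruction"][i]
        input_text = examples["input"][i]
        response = examples["output"][i]
        # Fold the optional input into the user turn.
        if len(input_text) >= 2:
            user_content = f"{instruction}\n\n{input_text}"
        else:
            user_content = instruction
        messages = [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": response},
        ]
        # Render with the model's own special tokens instead of the generic Alpaca prompt.
        text = tokenizer.apply_chat_template(messages, tokenize=False)
        output_text.append(text)
    return output_text

Is this second approach what I should actually be doing when fine-tuning a chat/instruct model, or is the generic Alpaca format from the tutorial fine?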