Finetuning Meta-Llama-3.1-8B using PEFT

Anilissac · September 24, 2024, 7:13am

I’ve been working on code to fine-tune (locally) the meta-llama/Meta-Llama-3.1-8B or meta-llama/Meta-Llama-3.1-8B-Instruct model that I downloaded in local server, using the example from the Hugging Face repository at huggingface/huggingface-llama-recipes. I made some adjustments to the code to fit my custom dataset.

While the training process seems to complete successfully, I’m encountering an issue during inference: the model generates gibberish responses, even for general questions.
Sample output

Question: What is the capital of France?
Answer: What is the capital of France?ders Шев麦 Roy Yoursacht_NCdersders Rakiset Roy

I’m not sure what might be wrong with the code or if I made any mistakes in the implementation.
Should I consider using a different model, or what additional steps can I take to ensure the fine-tuning works effectively?

Note: I downloaded the model by excluding the original checkpoint and only retrieved the main folder.

huggingface-cli download meta-llama/Meta-Llama-3.1-8B --exclude "original/*" --local-dir Meta-Llama-3.1-8B

This this updated code.

import torch
import json
from datasets import load_dataset, Dataset

from trl import SFTTrainer
from peft import LoraConfig
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

def load_and_flatten_dataset(file_path):
    with open(file_path, "r") as f:
        raw_data = json.load(f)
    formatted_data = []
    for entry in raw_data:
        context = entry['context']
        for q in entry['questions']:
            formatted_data.append({
                "text": f"Context: {context}\nQuestion: {q['question']}\nAnswer: {q['answer']}"
            })

    return Dataset.from_dict({"text": [d['text'] for d in formatted_data]})

def tokenize_dataset(dataset, tokenizer):

    def tokenize_function(qa_data):
        inputs = tokenizer(
            qa_data['text'], 
            padding="max_length",  
            truncation=True,
            max_length=512,        
            return_tensors="pt"
        )

        return inputs

    return dataset.map(tokenize_function, batched=True)


dataset_file = "data/questions_answers.json"
model_path = "./Meta-Llama-3.1-8B"
dataset = load_and_flatten_dataset(dataset_file)
tokenizer = AutoTokenizer.from_pretrained(model_path)

if tokenizer.pad_token is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id
    tokenizer.pad_token = tokenizer.eos_token 

tokenized_datasets = tokenize_dataset(dataset, tokenizer)
#dataset = load_dataset("imdb", split="train")

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    logging_dir='./logs',
    logging_steps=10,
    fp16=True
)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=5,
    per_device_train_batch_size=4,  
    logging_dir='./logs',
    logging_steps=10,
    gradient_accumulation_steps=4,  
    eval_strategy="epoch",
    save_strategy="epoch",
    fp16=True,  
    ddp_find_unused_parameters=False,  
    report_to="none",  
)

QLoRA = True
if QLoRA:
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_quant_type="nf4"
    )
    
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=quantization_config,
        device_map="auto"  
    )

    lora_config = LoraConfig(
        r=8,
        target_modules="all-linear",
        bias="none",
        task_type="CAUSAL_LM",
    )
else:
    model = AutoModelForCausalLM.from_pretrained(model_path)
    lora_config = None


trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    peft_config=lora_config,
    train_dataset=tokenized_datasets,
    eval_dataset=tokenized_datasets,
    dataset_text_field="text",
)

trainer.train()


output_dir="./fine-tuned-llama"

model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

JSON file contains following sample data.

   {
        "context": "On 05 May 2019, Anne Johnson joined QandA Technologies. She was assigned Employee Number 1000 and the username gbuch. She was born on 12 May 1980. Her father's name is Peter Johnson and mother's name is Diana Johnson. She holds the position of Graphic Designer. For contact, her mobile number is 7344186426.",
        "questions": [
            {
                "question": "When did Anne Johnson join QandA Technologies?",
                "answer": "05 May 2019"
            },
            {
                "question": "What is Anne Johnson's employee number?",
                "answer": "1000"
            },
            {
                "question": "What is Anne Johnson's birthdate?",
                "answer": "12 May 1980"
            },
            {
                "question": "What is Anne Johnson's father's name?",
                "answer": "Peter Johnson"
            },
            {
                "question": "What is Anne Johnson's mother's name?",
                "answer": "Diana Johnson"
            },
            {
                "question": "What is Anne Johnson's job position?",
                "answer": "Graphic Designer"
            },
        ]
    },

Chahnwoo · September 24, 2024, 8:22am

There have been issues with the Llama 3.1 8B tokenizer, so I would check if the special tokens (namely the EOS token) have been correctly added to the tokenized dataset.

Anilissac · September 26, 2024, 5:36am

@Chahnwoo
I tried to add EOS token manually at the end of text line.

EOS_TOKEN = tokenizer.eos_token
"text": f"Context: {context}\nQuestion: {q['question']}\nAnswer: {q['answer']} + EOS_TOKEN"

But still it showed same result.

Chahnwoo · September 27, 2024, 7:30am

The formatted string you’ve shared here doens’t actually seem to add an eos_token to the end of a text. Instead, it seems to just be adding the raw text " + EOS_TOKEN". I would double check that first and foremost.

irisma00 · February 1, 2025, 2:07am

Hi, I encountered the same problem. Could you let me know how you solved it? Thanks!

---- update

I solved my problem by following the code here. You also need to save fine tuned model using trainer.save_model(output_dir) instead of model.save_pretrained() .

Topic		Replies	Views
Running LLaMA 3.1 8B Model Downloaded from Meta - Missing Configuration File Beginners	3	556	December 30, 2024
AutoModelForCausalLM() to HuggingFaceLLM Beginners	2	2960	October 4, 2024
Hugging Face to GGUF Conversion Broken? 🤗Hub	1	5263	February 11, 2024
How to use hugging face transformers for testing a dataset 🤗Transformers	1	266	May 4, 2024
meta-llama/Llama-3.2-11B-Vision-Instruct did not reply 🤗Transformers	10	12908	October 29, 2024

Finetuning Meta-Llama-3.1-8B using PEFT

Related topics