Config.json is not saving after finetuning Llama 2

After finetuning, I’m not able to save a config.json file using trainer.model.save_pretrained.

My pip install:

!pip install torch datasets
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7

Model code:

import torch
import transformers
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

model_name = 'meta-llama/Llama-2-7b-chat-hf'

model_config = transformers.AutoConfig.from_pretrained(
    model_name,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# 4-bit quantization config for loading the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
)

huggingface_dataset_name = "mlabonne/guanaco-llama2-1k"

#dataset = load_dataset(huggingface_dataset_name, "pqa_labeled", split = "train")

dataset = load_dataset(huggingface_dataset_name, split="train")

# LoRA attention dimension
lora_r = 64

# Alpha parameter for LoRA scaling
lora_alpha = 16

# Dropout probability for LoRA layers
lora_dropout = 0.1

# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

training_arguments = TrainingArguments(
    output_dir="random_weights",
    fp16=True,
    learning_rate=1e-5,
    num_train_epochs=5,
    weight_decay=0.01,
    logging_steps=1,
    max_steps=1,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_arguments
)

# Train model
trainer.train()

trainer.model.save_pretrained("finetuned_llama")

Now, when I go to load the model, it complains about not having a config.json. I’ve also tried explicitly saving the config using trainer.model.config.save_pretrained, but I had no luck.
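
For reference, the loading step I’m attempting is roughly this (same output directory as above):

from transformers import AutoModelForCausalLM

# Fails with an error complaining that config.json is missing from finetuned_llama
reloaded = AutoModelForCausalLM.from_pretrained("finetuned_llama")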

I might be missing something obvious. Any ideas on what might be going wrong?

Did you fix that?

I think it is because you are using LoRA, which trains an adapter model rather than a full model. Try calling model.merge_and_unload() to merge the adapter into your base model before saving.
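
A rough, untested sketch of what that could look like, reusing model_name, tokenizer, and trainer from your script (the output directory names are just placeholders). With these library versions, merging directly into the 4-bit quantized model may not be supported, so the sketch reloads the base model in fp16 and merges there:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Save only the LoRA adapter produced by training
trainer.model.save_pretrained("finetuned_llama")

# Reload the base model in fp16 and attach the saved adapter to it
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
)
merged_model = PeftModel.from_pretrained(base_model, "finetuned_llama")

# Merge the adapter weights into the base model and save the full model;
# this time config.json is written alongside the merged weights
merged_model = merged_model.merge_and_unload()
merged_model.save_pretrained("finetuned_llama_merged")
tokenizer.save_pretrained("finetuned_llama_merged")

Loading finetuned_llama_merged with AutoModelForCausalLM.from_pretrained should then work without the config.json complaint.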