Further finetuning a LoRA finetuned CausalLM Model

Hey everyone,

I am a bit unsure how to proceed regarding the mentioned topic.

The baseline is a model created via Huggingface’s library as an AutoModelForCausalLM model, PEFT and a LoRA approach with subsequent merging of the weights.

I now want to further fine tune the model without losing its original properties - in this case via instruction fine tuning or prefix tuning.

My approach would be the following:

 model = AutoModelForCausalLM.from_pretrained(
        model_id,
        use_cache=False if gradient_checkpointing else True
        device_map="auto",
        load_in_8bit=True,
    )

model = create_peft_config(model)

output_dir = "/tmp"
training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=True,
        per_device_train_batch_size=per_device_train_batch_size,
        per_device_eval_batch_size=per_device_train_batch_size,
        bf16=bf16,
        learning_rate=lr,
        num_train_epochs=epochs,
        gradient_checkpointing=gradient_checkpointing,
        gradient_accumulation_steps=2,
        logging_dir=f"{output_dir}/logs",
        logging_strategy="steps",
        logging_steps=10,
        optim="adafactor",
        save_strategy="epoch",
        save_total_limit=3,
        evaluation_strategy="epoch",
        load_best_model_at_end=False,
        no_cuda=False,
        auto_find_batch_size=True
)

trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=dataset_train,
        compute_metrics=compute_metrics,
        preprocess_logits_for_metrics=preprocess_logits_for_metrics,
        eval_dataset=dataset_eval,
        data_collator=default_data_collator
)

trainer.train()

trainer.model.save_pretrained(output_dir)

del model
del trainer

peft_config = PeftConfig.from_pretrained(output_dir)
model = AutoModelForCausalLM.from_pretrained(
        peft_config.base_model_name_or_path,
        load_in_8bit=False,
        return_dict=True,
        device_map="auto",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
)
model = PeftModel.from_pretrained(
        model,
        output_dir,
        torch_dtype=torch.float16,
        device_map="auto",
)
model.eval()
os.makedirs("lora", exist_ok=True)

merged_model = model.merge_and_unload()
merged_model.save_pretrained('lora')

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.save_pretrained('lora')

In principle, I am loading the original model with the merged weights, finetune that on new data likewise with PEFT and LoRA and afterwards merging the weights again into the base model.

Is this a sensible approach, or is there something to suggest, for example, that I might even significantly compromise the original capabilities by doing so? If something speaks against it, what would be a better approach?

Kind regards and thanks in advance