So I think I found a solution to this, but if anyone has more info on this topic please lmk! After the first round of training, save the fine-tuned adapter. Then reload the base model, wrap it with `PeftModel.from_pretrained()`, and call `merge_and_unload()` to merge the adapter into the base model. Then save the merged model. The saved merged model will be the size of the base model, with the fine-tuned layers baked in. To fine-tune further, load the merged model and train it as if it were the base model.
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Save the fine-tuned adapter (adapter weights only, so it's small)
trainer_filepath = f"trainer/llama7b/{train_util.get_time()}"
trainer.model.save_pretrained(trainer_filepath)

# Reload the base model
base_model = AutoModelForCausalLM.from_pretrained(model_name, token=huggingface_token)

# Merge the base model and the fine-tuned adapter
merged_model = PeftModel.from_pretrained(base_model, trainer_filepath)
merged_model = merged_model.merge_and_unload()

# Save the merged model (full base-model size, adapter weights baked in)
merged_model_path = f"model/llama7b/merged_{train_util.get_time()}"
merged_model.save_pretrained(merged_model_path)
```
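For the next round, here's a rough sketch of what loading the merged checkpoint as the new base and attaching a fresh LoRA adapter could look like. The `LoraConfig` hyperparameters below are just placeholder assumptions, not values from my actual run:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the merged checkpoint as if it were the base model
next_base = AutoModelForCausalLM.from_pretrained(merged_model_path)

# Attach a fresh LoRA adapter for the next round of fine-tuning
# (these hyperparameters are placeholders -- use your own config)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
next_model = get_peft_model(next_base, lora_config)

# ...then build a Trainer around next_model and train as before
```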
If there’s a better solution to this problem, please lmk!