Hi, I am having an issue with the PEFT model's save_pretrained method: I am not able to change the base_model_name_or_path key in adapter_config.json once the model is saved with save_pretrained. I tried setting that same key in the LoraConfig to the name I want, but it is not carried over. This is a problem because when I later try to load the adapter with from_pretrained, it raises an error since it cannot find the correct base model, neither as a local directory nor as a repo on the Hugging Face Hub.
Some other info:
Model: custom Phi-3 implementation, but the issue also occurs with the standard Phi-3
OS: Linux (Ubuntu)
Transformers version: 4.43.3
PEFT version: 0.11.1
I think the following code should be able to reproduce the issue with a Phi-3 model:
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, AutoPeftModelForCausalLM, PeftModel, PeftConfig, PeftModelForCausalLM
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("/content/gdrive/My Drive/Colab Notebooks/DOE/src/models/weights/Phi-3-mini-128k-instruct")
peft_config = LoraConfig(**{
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": "all-linear",
    "modules_to_save": None,
    # the value I expect to end up in adapter_config.json:
    "base_model_name_or_path": "/content/gdrive/My Drive/Colab Notebooks/DOE/src/models/weights/Phi-3ex"
})
kbit_model = prepare_model_for_kbit_training(model)
peft_model = get_peft_model(kbit_model, peft_config)
peft_model.save_pretrained("/content/gdrive/MyDrive/Colab Notebooks/DOE/test_peft_model/phi-peft-v3")
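For completeness, loading the adapter back then looks roughly like this for me (a minimal sketch, not necessarily the recommended call; as far as I understand, from_pretrained resolves the base model from the base_model_name_or_path key in the saved adapter_config.json, which is where it fails when that key does not point to a valid local directory or Hub repo):

# continuing from the snippet above (AutoPeftModelForCausalLM is already imported);
# this reads base_model_name_or_path from adapter_config.json to rebuild the base model
loaded_model = AutoPeftModelForCausalLM.from_pretrained(
    "/content/gdrive/MyDrive/Colab Notebooks/DOE/test_peft_model/phi-peft-v3"
)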
The expected base_model_name_or_path key in adapter_config.json should be /content/gdrive/My Drive/Colab Notebooks/DOE/src/models/weights/Phi-3ex, but instead adapter_config.json contains the following:
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "/content/gdrive/My Drive/Colab Notebooks/DOE/src/models/weights/Phi-3-mini-128k-instruct",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 16,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "gate_up_proj",
    "o_proj",
    "qkv_proj",
    "down_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
  "use_rslora": false
}
I am not that well versed in this, so I may have missed something, but it is currently not working for me. I appreciate any help.