After fine-tuning, I'm not able to save a config.json file using trainer.model.save_pretrained
My pip install:
!pip install torch datasets
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
Model code:
import torch
import transformers
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

model_name = 'meta-llama/Llama-2-7b-chat-hf'
model_config = transformers.AutoConfig.from_pretrained(
    model_name,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
# 4-bit NF4 quantization for the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
)
huggingface_dataset_name = "mlabonne/guanaco-llama2-1k"
#dataset = load_dataset(huggingface_dataset_name, "pqa_labeled", split = "train")
dataset = load_dataset(huggingface_dataset_name, split="train")
# LoRA attention dimension
lora_r = 64
# Alpha parameter for LoRA scaling
lora_alpha = 16
# Dropout probability for LoRA layers
lora_dropout = 0.1
# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)
training_arguments = TrainingArguments(
    output_dir="random_weights",
    fp16=True,
    learning_rate=1e-5,
    num_train_epochs=5,
    weight_decay=0.01,
    logging_steps=1,
    max_steps=1,
)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_arguments,
)
# Train model
trainer.train()
trainer.model.save_pretrained("finetuned_llama")
Now, when I try to load the model, it complains about not having a config.json. I've also tried explicitly saving the config using trainer.model.config.save_pretrained
but had no luck.
I might be missing something obvious. Any ideas on what might be going wrong?
I think it is because you are using LoRA which trains an adapter model. Try calling model.merge_and_unload()
to merge the adapter model with your base model before saving.
Same problem with me also.
Can you please elaborate?
If you train a model with LoRA (low-rank adaptation), you only train adapters on top of the base model. E.g. if you fine-tune LLaMA with LoRA, you only add a couple of linear layers (so-called adapters) on top of the original (also called base) model. Hence calling save_pretrained()
or push_to_hub()
will only save 2 things:
- the adapter configuration (in an
adapter_config.json
file)
- the adapter weights (typically in a safetensors file).
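To check which of the two cases you are in, you can look at what actually got written to disk. Below is a minimal sketch using only the standard library; the helper name describe_checkpoint is made up for illustration, and it assumes the adapter config stores the base model under the base_model_name_or_path key, which is what PEFT writes for LoRA adapters.

```python
import json
from pathlib import Path


def describe_checkpoint(folder):
    """Report whether a saved folder holds a LoRA adapter, a full model, or neither."""
    folder = Path(folder)
    adapter_config = folder / "adapter_config.json"
    if adapter_config.is_file():
        # Adapter-only save: the base model is referenced by name, not stored.
        config = json.loads(adapter_config.read_text(encoding="utf-8"))
        return {"kind": "adapter", "base_model": config.get("base_model_name_or_path")}
    if (folder / "config.json").is_file():
        # Full model save: config.json sits next to the model weights.
        return {"kind": "full_model", "base_model": None}
    return {"kind": "unknown", "base_model": None}
```

For the folder from the question, describe_checkpoint("finetuned_llama") would be expected to report kind "adapter", which explains the missing config.json.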
See here for example: ybelkada/opt-350m-lora at main. Here, OPT-350m is the base model.
In order to merge these adapter layers back into the base model, one can call the merge_and_unload method. Afterwards, you can call save_pretrained()
on it which will save both the weights and the configuration in a config.json file:
from transformers import AutoModelForCausalLM
from peft import PeftModel
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_model_name)
model = model.merge_and_unload()
model.save_pretrained("my_model")
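Applied to the setup from the question, this could look roughly like the sketch below. It rests on a few assumptions: the base model is reloaded in float16 rather than 4-bit (merging into a quantized base may not be supported in older PEFT releases), the function name merge_and_save is invented for this example, and the tokenizer is saved alongside so the output folder can be loaded on its own.

```python
def merge_and_save(base_model_name, adapter_dir, output_dir):
    """Merge a saved LoRA adapter into its base model and write a standalone checkpoint."""
    # Imported lazily so that defining the helper does not trigger heavy imports.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: reload the base model unquantized (float16) before merging.
    base = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # Writes the merged weights plus config.json.
    merged.save_pretrained(output_dir)
    # The tokenizer is not part of the model, so save it separately.
    AutoTokenizer.from_pretrained(base_model_name).save_pretrained(output_dir)
```

With the names from the question, the call would be merge_and_save("meta-llama/Llama-2-7b-chat-hf", "finetuned_llama", "my_model").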
One feature of the Transformers library is that it has PEFT integration, which means that you can call from_pretrained()
directly on a folder/repository that only contains this adapter_config.json
file and the adapter weights, and it will automatically load the weights of the base model + adapters. See PEFT integrations. Hence we could also just have done this:
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(folder_containing_only_adapter_weights)
model = model.merge_and_unload()
model.save_pretrained("my_model")
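If merge_and_unload turns out not to be available on the object returned by Transformers, a possible alternative is PEFT's own auto class, which returns a PeftModel. This is only a sketch: it assumes a PEFT release that ships AutoPeftModelForCausalLM (as far as I know, newer than the 0.4.0 pinned in the question), and merge_adapter_folder is a made-up helper name.

```python
def merge_adapter_folder(adapter_dir, output_dir):
    """Load base model + adapter from an adapter-only folder, merge, and save."""
    # Imported lazily so that defining the helper does not trigger heavy imports.
    import torch
    from peft import AutoPeftModelForCausalLM

    # The base model is resolved from base_model_name_or_path in adapter_config.json.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_dir, torch_dtype=torch.float16)
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir)
```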
I got the following error while running the code above:
PS C:\Users\Anonymous\Desktop\AIRL Lab\MedLLM> python -u "c:\Users\Anonymous\Desktop\AIRL Lab\MedLLM\HF Connection (Temporary Use)\merging.py"
config.json: 100%|██████████| 699/699 [00:00<?, ?B/s]
model.safetensors: 100%|██████████| 2.20G/2.20G [14:18<00:00, 2.56MB/s]
generation_config.json: 100%|██████████| 124/124 [00:00<?, ?B/s]
Traceback (most recent call last):
  File "c:\Users\Anonymous\Desktop\AIRL Lab\MedLLM\HF Connection (Temporary Use)\merging.py", line 8, in
    model = PeftModel.from_pretrained(model, adapter_model_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Anonymous\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\peft\peft_model.py", line 356, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "C:\Users\Anonymous\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\peft\peft_model.py", line 727, in load_adapter
    adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Anonymous\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\peft\utils\save_and_load.py", line 326, in load_peft_weights
    adapters_weights = safe_load_file(filename, device=device)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Anonymous\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\safetensors\torch.py", line 311, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
PS C:\Users\Anonymous\Desktop\AIRL Lab\MedLLM>
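The traceback shows the failure happens while reading the adapter's safetensors file, not the base model. One possible cause (an assumption, the log alone does not confirm it) is that the adapter file is truncated, corrupted, or is a Git LFS pointer instead of the real weights. A safetensors file starts with an 8-byte little-endian header length followed by a JSON header, so a rough sanity check can be done with the standard library. The function name and the returned messages below are invented for this sketch.

```python
import json
import os
import struct


def check_safetensors_header(path):
    """Return "ok" if the file starts with a plausible safetensors header, else a short reason."""
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        start = f.read(8)
        # A Git LFS pointer is a small text file, not the real weights.
        if start.startswith(b"version "):
            return "looks like a git-lfs pointer"
        if len(start) < 8:
            return "file shorter than 8 bytes"
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", start)
        if header_len > file_size - 8:
            return "header length exceeds file size"
        raw_header = f.read(header_len)
    try:
        json.loads(raw_header.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "header is not valid JSON"
    return "ok"
```

If this reports anything other than "ok" for the adapter weights file, re-downloading or re-saving the adapter would be the first thing to try.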