Now, when I go to load the model, it complains about not having a config.json. I've also tried explicitly saving the config using trainer.model.config.save_pretrained, but I had no luck.
I might be missing something obvious. Any ideas on what might be going wrong?
I think it is because you are using LoRA which trains an adapter model. Try calling model.merge_and_unload() to merge the adapter model with your base model before saving.
If you train a model with LoRA (low-rank adaptation), you only train adapters on top of the base model. E.g. if you fine-tune LLaMA with LoRA, you only add a couple of linear layers (so-called adapters) on top of the original (also called base) model. Hence calling save_pretrained() or push_to_hub() will only save 2 things:
the adapter configuration (in an adapter_config.json file)
the adapter weights (typically in a safetensors file). A minimal sketch of this behaviour follows below.
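Here is a minimal sketch of what happens at save time, assuming a plain LoRA setup with PEFT (the model name, LoRA settings and output folder are placeholders I picked for illustration):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# wrap a base model with LoRA adapters (placeholder model name)
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
peft_model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))

# this only writes adapter_config.json + the adapter weights
# (e.g. adapter_model.safetensors), not the base model's config.json
peft_model.save_pretrained("my_adapter")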
In order to merge these adapter layers back into the base model, one can call the merge_and_unload() method. Afterwards, you can call save_pretrained() on the merged model, which will save both the weights and the configuration, including the config.json file you were missing:
from transformers import AutoModelForCausalLM
from peft import PeftModel

# load the base model, then load the LoRA adapters on top of it
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_model_name)

# merge the adapters into the base weights and drop the PEFT wrapper
model = model.merge_and_unload()
model.save_pretrained("my_model")
One feature of the Transformers library is its PEFT integration: you can call from_pretrained() directly on a folder/repository that only contains the adapter_config.json file and the adapter weights, and it will automatically load the base model's weights together with the adapters. See PEFT integrations. Hence we could also just have done this:
from transformers import AutoModelForCausalLM

# Transformers' PEFT integration resolves the base model from adapter_config.json
model = AutoModelForCausalLM.from_pretrained(folder_containing_only_adapter_weights)
model.save_pretrained("my_model")
I tried this:
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(folder_containing_only_adapter_weights)
model = model.merge_and_unload()
model.save_pretrained("my_model")
and I got this error: AttributeError: 'LlamaForCausalLM' object has no attribute 'merge_and_unload'
It looks like one can only call the merge_and_unload() method on the PeftModel class from the PEFT library, not on a Transformers model class. So the only way to merge adapters into the base model is to load it explicitly as a PeftModel:
from transformers import AutoModelForCausalLM
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_model_name)
model = model.merge_and_unload()
model.save_pretrained("merged_adapters")