It is not clear to me what the correct way is to save/load a PEFT checkpoint, as well as the final fine-tuned model. There have been reports of trainer.resume_from_checkpoint not working as expected [1][2][3], each of which has very few replies or no real consensus. Proposed solutions range from trainer.save_model, to trainer.save_state, to resume_from_checkpoint=True, to model.save_pretrained (PEFT docs), to an even more complicated procedure of merging and saving the model [4].
It is very confusing trying to figure out which of these is correct, especially if resume_from_checkpoint can be buggy. Loading/saving models really should not be this confusing, so can we settle once and for all what the officially recommended (and tested) way is to save/load adapters, as well as individual checkpoints during training? Can we update the HF docs accordingly and simplify this process?
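To make the ambiguity concrete, these are the kinds of snippets the various threads suggest. This is just a sketch, with trainer and model coming from a standard Trainer/PEFT setup, and the paths as placeholders:

# Option 1: save/resume through the Trainer
trainer.save_model("outputs/final_adapter")  # suggested in some threads
trainer.save_state()                         # suggested in others
trainer.train(resume_from_checkpoint=True)   # reportedly buggy with PEFT [1][2][3]

# Option 2: save the adapter directly, per the PEFT docs
model.save_pretrained("outputs/final_adapter")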
Hi @remorax98
Unfortunately, Sylvain is no longer a part of Hugging Face (just wanted to clarify this, since lots of people keep tagging him).
Also, for loading your PEFT model for continued training, there is a very easy parameter for this called is_trainable, which lets you load your PEFT model in a trainable state; then you can continue your training easily. Here is how to use it:
For my repo not-lain/Gemma-2b-Peft-finetuning, all I have to do is:
# most of this code is from the button at the top right corner on 🤗
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM
config = PeftConfig.from_pretrained("not-lain/Gemma-2b-Peft-finetuning")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
model = PeftModel.from_pretrained(
    model,
    "not-lain/Gemma-2b-Peft-finetuning",
    is_trainable=True,  # 👈 here
)
# check if it's working
model.print_trainable_parameters()
# >>> trainable params: 9,805,824 || all params: 2,515,978,240 || trainable%: 0.3897420034920493
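After that you can hand the model to a regular Trainer to continue training. A rough sketch (the dataset and hyperparameters below are placeholders you would replace with your own):

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2b-peft-continued",  # checkpoints land here
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,                  # the PeftModel loaded with is_trainable=True
    args=training_args,
    train_dataset=train_dataset,  # placeholder: your tokenized dataset
)
trainer.train()

# Since model is a PeftModel, save_pretrained writes only the adapter weights
trainer.model.save_pretrained("gemma-2b-peft-continued/final_adapter")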
Heart-react this comment if it solved the problem for you.
Hi @not-lain thanks for letting me know about Sylvain! Sorry, my bad.
Regarding this parameter: it is good to know, and quite useful for me. But it does not answer my specific question. I am more concerned with the proper way of saving adapters and checkpoints in HF, and the lack of clarity in the documentation regarding the same.
Don't mention it @remorax98.
I also made this notebook for you explaining all the steps, from the initial training to reloading the model and continuing training. It took me a lot of time to get it done, but I hope it helps you out.
If this notebook helped you clarify how to use PEFT, please consider marking this conversation as solved.
Guys, can you help with proper examples of storing it to the file system instead of pushing to the Hub? I tried
trainer.model.save_pretrained("shake_adapter")
and then…
@adiudiun, I had the same problem. The correct way is to first load the adapter config using PeftConfig.from_pretrained('saved_dir') (which tells you the base model), then load the base model using AutoModel.from_pretrained(), and finally load the peft_model as follows:
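For example, reusing the "shake_adapter" directory from the post above (I am using AutoModelForCausalLM on the assumption that the base model is a causal LM):

from transformers import AutoModelForCausalLM
from peft import PeftConfig, PeftModel

adapter_dir = "shake_adapter"  # the local directory you passed to save_pretrained

# 1. Read the adapter config to find the base model it was trained on
config = PeftConfig.from_pretrained(adapter_dir)

# 2. Load that base model
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)

# 3. Attach the adapter weights from the local directory
model = PeftModel.from_pretrained(
    base_model,
    adapter_dir,
    is_trainable=True,  # only needed if you plan to keep training the adapter
)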
from peft import PeftModel, PeftConfig
from unsloth import FastLanguageModel
import torch
max_seq_length = 4096 # Can increase for longer reasoning traces
lora_rank = 32 # Larger rank = smarter, but slower
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-4B-Instruct-2507",
    max_seq_length = max_seq_length,
    load_in_4bit = True,  # False for LoRA 16bit
    # fast_inference = True,          # Enable vLLM fast inference
    # max_lora_rank = lora_rank,
    # gpu_memory_utilization = 0.7,   # Reduce if out of memory
)
model = PeftModel.from_pretrained(
    model,
    "/kaggle/input/qwen3-4b-instruct-lora/Qwen3_(4B)-Instruct_lora_model",
    is_trainable=True,  # here
)
âŚ
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn