Loading and saving a model

jmeld · February 26, 2024, 7:24pm

I’m trying to fine-tune a model over several days because of time limitations: a few epochs one day, a few epochs the next, and so on. However, every time I try to load from the adapter config file saved by the previous training session, the model that loads is the base model, as if no fine-tuning had occurred. I’m not sure what is happening. Does anyone have advice on how to fix this? Could it be a result of my saving strategy or my use of patience?

My training arguments are as follows:

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from trl import SFTTrainer

# quantization_config, huggingface_token, data_collator, new_dataset, valid_set,
# peft_params, tokenizer and callback are defined earlier in my script.

# Load local model (the directory written by trainer.save_model below)
adapter_path = "./model"
model = AutoModelForCausalLM.from_pretrained(
    adapter_path,
    quantization_config=quantization_config,
    device_map={"": 0},
    token=huggingface_token,
)
model.config.use_cache = False
model.config.pretraining_tp = 1

training_params = TrainingArguments(
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="epoch",
    num_train_epochs=3,
    output_dir="./newresults",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=80,
    optim="adamw_torch",
    learning_rate=1e-5,
    weight_decay=0.002,
    fp16=True,
    bf16=False,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    report_to="tensorboard",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
trainer = SFTTrainer(
    model=model,
    data_collator=data_collator,
    train_dataset=new_dataset,
    peft_config=peft_params,
    dataset_text_field="input",
    tokenizer=tokenizer,
    args=training_params,
    eval_dataset=valid_set,
    packing=False,
    callbacks=[callback],
)

trainer.train()
trainer.save_model("./model")
```
This should save the model at every epoch in my local ./newresults directory, and it should save the final fine-tuned model in ./model. But when I try to load from either directory for the next round of training, the model that is loaded is the base model, not the fine-tuned one. What might be the reason? Also, is there a way to tell which model has been loaded before training starts? Right now I can only tell because the re-loaded model's loss after an epoch of training is exactly what it was after the very first round of training from the base model.
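
One way to check what is actually on disk before the next round is to inspect the saved directory. The sketch below is not a confirmed diagnosis. It assumes the directory was written by PEFT, so it would contain `adapter_config.json` (with a `base_model_name_or_path` field) plus adapter weights rather than a full model. The helper names `describe_checkpoint` and `load_adapter_for_training` are made up for illustration. The second helper shows one commonly used way to reattach a saved adapter with `PeftModel.from_pretrained` and `is_trainable=True`; whether `peft_config` should still be passed to `SFTTrainer` afterwards is something to verify against your TRL version.

```python
import json
from pathlib import Path


def describe_checkpoint(path):
    """Report whether a directory looks like a PEFT adapter or a full model."""
    folder = Path(path)
    adapter_cfg = folder / "adapter_config.json"
    info = {
        # An adapter-only save contains adapter_config.json
        "is_adapter": adapter_cfg.is_file(),
        # A full model save contains config.json
        "has_full_model_config": (folder / "config.json").is_file(),
        "has_adapter_weights": any(
            (folder / name).is_file()
            for name in ("adapter_model.safetensors", "adapter_model.bin")
        ),
        "base_model": None,
    }
    if info["is_adapter"]:
        cfg = json.loads(adapter_cfg.read_text())
        # The base model the adapter was trained on top of
        info["base_model"] = cfg.get("base_model_name_or_path")
    return info


def load_adapter_for_training(base_model, adapter_dir):
    """Attach saved adapter weights to an already loaded base model (hypothetical helper)."""
    # Imported lazily so describe_checkpoint works without peft installed
    from peft import PeftModel

    # is_trainable=True keeps the adapter weights trainable for further fine-tuning
    return PeftModel.from_pretrained(base_model, adapter_dir, is_trainable=True)
```

Calling `describe_checkpoint("./model")` before loading shows whether the directory holds an adapter and which base model it points to. After loading, `type(model).__name__` and `model.print_trainable_parameters()` (available on a `PeftModel`) give a quick indication of whether an adapter is attached, without waiting for an epoch of training.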
