How to load a fine-tuned model (merged weights) on Colab?

I have fine-tuned the Llama 2 model, reloaded the base model, and merged the LoRA weights into it. I then saved the merged model, and now I intend to run it.


import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map=device_map,
)

# Attach the LoRA adapter and fold its weights into the base model
model = PeftModel.from_pretrained(base_model, new_model)
model = model.merge_and_unload()

# Save the standalone merged model
model.save_pretrained("path/to/model")

Now, I would like to load the model from path/to/model using the following code:

import torch
import transformers

# model_id points at the merged checkpoint saved above, i.e. path/to/model
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth,
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    device_map='auto',           # let Accelerate place layers on GPU/CPU/disk
    offload_folder="offload",    # where layers that do not fit are spilled
    torch_dtype=torch.float16,
    use_auth_token=hf_auth,
    offload_state_dict=True,
)
model.eval()
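
For reference, here is a minimal generation sketch for running the loaded model. It assumes the tokenizer is also available at model_id (an assumption, since the snippet above does not load one), and the prompt is just an example:

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, use_auth_token=hf_auth)

# Tokenize a prompt and move the tensors to wherever the first layer lives
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))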

My intent behind saving the merged model is to eliminate the dependency on the base model.
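
As a side note, for the saved directory to be fully self-contained it also needs the tokenizer files. A minimal sketch, assuming the tokenizer comes from the same model_name used above:

from transformers import AutoTokenizer

# Save the tokenizer next to the merged weights so path/to/model
# can be loaded without referencing the base model at all
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.save_pretrained("path/to/model")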

Problem

While running the model in Colab, I see no GPU usage; only the CPU is being used, and this crashes the runtime. I would like to know what is causing the GPU not to be used.
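
One way to narrow this down is to check that PyTorch can see the GPU at all, and then inspect where Accelerate actually placed the layers (hf_device_map is populated whenever device_map is passed to from_pretrained):

import torch

# True only when the Colab runtime type is set to GPU
print(torch.cuda.is_available())

# Maps each module of the loaded model to a device; entries saying
# "cpu" or "disk" mean those layers were offloaded off the GPU
print(model.hf_device_map)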

Would you try saving only the adapter? When you need the model, load the base model and the adapter, merge them, and then run it, as in the sketch below. I think saving the merged model triggers a bug.
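
A minimal sketch of that flow, assuming model_name is the base checkpoint and new_model is the directory holding only the saved LoRA adapter:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model straight onto the GPU
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the adapter and merge at use time instead of saving merged weights
model = PeftModel.from_pretrained(base_model, new_model)
model = model.merge_and_unload()
model.eval()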