Trainer is not saving all layers when fine-tuning Llama with P-Tuning

I’m fine-tuning LLaMA-3-8B-Instruct with P-Tuning, but I’ve noticed an issue: the mlp_head isn’t being saved in the checkpoint. Only the embedding is saved.

When I prepare the model for fine-tuning, the architecture includes the following prompt_encoder module:

(prompt_encoder): ModuleDict(
    (default): PromptEncoder(
        (embedding): Embedding(200, 4096)
        (mlp_head): Sequential(
            (0): Linear(in_features=4096, out_features=1024, bias=True)
            (1): ReLU()
            (2): Linear(in_features=1024, out_features=1024, bias=True)
            (3): ReLU()
            (4): Linear(in_features=1024, out_features=4096, bias=True)
        )
    )
)
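
For context, this is roughly how I set the model up (the model path and config below are a simplified sketch of my setup; the config values match the printed architecture above):

from transformers import AutoModelForCausalLM
from peft import PromptEncoderConfig, TaskType, get_peft_model

# model path simplified here; config values correspond to the printed prompt_encoder
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

peft_config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=200,       # -> Embedding(200, 4096)
    encoder_hidden_size=1024,     # -> the 4096 -> 1024 -> 1024 -> 4096 mlp_head
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()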

However, after training, when I load the checkpoint, the mlp_head is missing. I only get this:

(prompt_encoder): ModuleDict(
    (default): PromptEncoder(
        (embedding): Embedding(200, 4096)
    )
)

I’m not sure why this is happening. I’m using the Trainer class and the get_peft_model function to set up the model for fine-tuning, and I load the checkpoint with PeftModel.from_pretrained. Am I missing something here?
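
Roughly, the training and loading flow looks like this (the dataset, output paths, and training arguments are placeholders, not my exact values):

from transformers import AutoModelForCausalLM, Trainer, TrainingArguments
from peft import PeftModel

# "model" is the P-Tuning-wrapped model from the snippet above;
# train_dataset and the output paths stand in for my actual setup
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama3-ptuning"),
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("llama3-ptuning/final")

# Loading the checkpoint afterwards:
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
loaded = PeftModel.from_pretrained(base, "llama3-ptuning/final")
print(loaded.prompt_encoder)  # only the Embedding shows up; mlp_head is missing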
