Trainer is not saving all layers when fine-tuning Llama with P-Tuning

I’m fine-tuning LLaMA-3-8B-Instruct with P-Tuning, but I’ve noticed an issue: the mlp_head isn’t being saved in the checkpoint. Only the embedding is saved.

When I prepare the model for fine-tuning, the architecture includes the following prompt_encoder module:

(prompt_encoder): ModuleDict(
    (default): PromptEncoder(
        (embedding): Embedding(200, 4096)
        (mlp_head): Sequential(
            (0): Linear(in_features=4096, out_features=1024, bias=True)
            (1): ReLU()
            (2): Linear(in_features=1024, out_features=1024, bias=True)
            (3): ReLU()
            (4): Linear(in_features=1024, out_features=4096, bias=True)
        )
    )
)
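
For context, this is roughly how I set the model up (the model path and config below are a simplified sketch of my setup; the config values match the printed architecture above):

from transformers import AutoModelForCausalLM
from peft import PromptEncoderConfig, TaskType, get_peft_model

# model path simplified here; config values correspond to the printed prompt_encoder
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

peft_config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=200,       # -> Embedding(200, 4096)
    encoder_hidden_size=1024,     # -> the 4096 -> 1024 -> 1024 -> 4096 mlp_head
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()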

However, after training, when I load the checkpoint, the mlp_head is missing. I only get this:

(prompt_encoder): ModuleDict(
    (default): PromptEncoder(
        (embedding): Embedding(200, 4096)
    )
)

I’m not sure why this is happening. I’m using the Trainer class and the get_peft_model function to set up the model for fine-tuning, and I load the checkpoint with PeftModel.from_pretrained. Am I missing something here?
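
Roughly, the training and loading flow looks like this (the dataset, output paths, and training arguments are placeholders, not my exact values):

from transformers import AutoModelForCausalLM, Trainer, TrainingArguments
from peft import PeftModel

# "model" is the P-Tuning-wrapped model from the snippet above;
# train_dataset and the output paths stand in for my actual setup
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama3-ptuning"),
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("llama3-ptuning/final")

# Loading the checkpoint afterwards:
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
loaded = PeftModel.from_pretrained(base, "llama3-ptuning/final")
print(loaded.prompt_encoder)  # only the Embedding shows up; mlp_head is missing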
