I’m fine-tuning LLaMA-3-8B-Instruct with P-Tuning, but I’ve noticed an issue: the mlp_head isn’t being saved in the checkpoint. Only the embedding is saved.
When I prepare the model for fine-tuning, the architecture includes the following prompt_encoder module:
(prompt_encoder): ModuleDict(
  (default): PromptEncoder(
    (embedding): Embedding(200, 4096)
    (mlp_head): Sequential(
      (0): Linear(in_features=4096, out_features=1024, bias=True)
      (1): ReLU()
      (2): Linear(in_features=1024, out_features=1024, bias=True)
      (3): ReLU()
      (4): Linear(in_features=1024, out_features=4096, bias=True)
    )
  )
)
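For reference, my setup looks roughly like the sketch below. This is a minimal reconstruction: the hyperparameters are the ones implied by the shapes above (200 virtual tokens, hidden size 1024), and the model loading is simplified, so the exact details of my script may differ slightly.

import torch
from transformers import AutoModelForCausalLM
from peft import PromptEncoderConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
)

# P-Tuning config: 200 virtual tokens reparameterized through an MLP,
# matching Embedding(200, 4096) and the 4096 -> 1024 -> 1024 -> 4096 mlp_head above.
peft_config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=200,
    encoder_reparameterization_type="MLP",
    encoder_hidden_size=1024,
)

model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()
print(model)  # shows the prompt_encoder with both embedding and mlp_head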
However, after training, when I load the checkpoint, the mlp_head is missing. I only get this:
(prompt_encoder): ModuleDict(
  (default): PromptEncoder(
    (embedding): Embedding(200, 4096)
  )
)
I’m not sure why this is happening. I’m using the Trainer class and the get_peft_model function to set up the model for fine-tuning, and I load the checkpoint with PeftModel.from_pretrained. Am I missing something here?
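For completeness, the save/load round trip is roughly the following (a sketch; the checkpoint path is a placeholder and the training loop is omitted):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# After training, save only the PEFT adapter weights.
model.save_pretrained("ptuning-checkpoint")

# Later, reload the adapter on top of the base model.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
loaded = PeftModel.from_pretrained(base_model, "ptuning-checkpoint")
print(loaded)  # the prompt_encoder now only contains the embedding, no mlp_head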