I am fine-tuning Llama for binary sequence classification with PEFT and LoRA using the Trainer class. The loss decreases steadily and accuracy on the validation data reaches ~90% in the final epoch: { "epoch": 4.98, "eval_accuracy": 0.9346576058546785, "eval_loss": 0.18449442088…

Different results from checkpoint evaluation when loading fine-tuned LLM model

botkop September 22, 2023, 8:24am 6

probably related:

Topic	Category	Replies	Views	Activity
Llama-2 Sequence Classification: Much lower accuracy on inference from checkpoint compared to model	🤗Transformers	5	6001	February 20, 2024
Load model from checkpoints occurs degraded performance	Beginners	2	828	July 7, 2023
Inference, checkpoint	Beginners	0	887	December 5, 2023
How to properly load the PEFT LoRA model	🤗Transformers	4	7421	April 13, 2025
Saving the adapter_model.bin from checkpoint pytorch_model.bin	🤗Transformers	0	844	June 15, 2023