How to properly load the PEFT LoRA model

I used PEFT LoRA + Trainer to fine-tune a model.

The predictions I get from the fine-tuned model right after training differ from the predictions I get after saving the model and loading it again.

How should I save and load the model so that the reloaded model gives consistent predictions?

Here’s my code. Thank you for your assistance.

# This code loads the saved adapter and predicts on the test set

from peft import PeftConfig, PeftModel
from transformers import AutoModelForMultipleChoice, Trainer

config = PeftConfig.from_pretrained('./deberta_adapter')
model = AutoModelForMultipleChoice.from_pretrained(config.base_model_name_or_path, return_dict=True)
model = PeftModel.from_pretrained(model, './deberta_adapter', device_map="auto")

trainer = Trainer(
    model=model,
    data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer)
)

test_predictions = trainer.predict(tokenized_test_dataset).predictions
test_predictions[:4]

array([[-0.2088623 , -0.2109375 , -0.2088623 , -0.21069336, -0.20812988],
       [-0.20654297, -0.20532227, -0.20178223, -0.20324707, -0.20385742],
       [-0.20751953, -0.20983887, -0.20947266, -0.21105957, -0.20874023],
       [-0.21350098, -0.21508789, -0.21435547, -0.21533203, -0.2154541 ]],
      dtype=float32)

# This code predicts on the test set right after trainer.train() finishes
trainer = Trainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
    train_dataset=train_tokenized_dataset,
    eval_dataset=test_tokenized_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()

test_predictions = trainer.predict(tokenized_test_dataset).predictions
test_predictions[:4]

array([[ 0.51416016,  0.9223633 ,  0.91748047,  1.0332031 ,  0.10601807],
       [ 0.58447266,  0.70703125,  0.7138672 ,  0.8330078 ,  0.5136719 ],
       [ 1.2412109 ,  0.80078125,  1.28125   ,  0.15576172,  0.9501953 ],
       [-0.12115479, -0.7988281 , -0.75097656, -1.1914062 , -1.4853516 ]],
      dtype=float32)

Additionally, if I save and load the model as shown below, I get approximately the same predictions as right after training, but I can't find anyone on the internet who loads a PEFT model this way.

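import torch
from peft import LoraConfig, TaskType, get_peft_model

# Save the full state dict of the PEFT-wrapped model after training
torch.save(trainer.model.state_dict(), "model.pt")

# Rebuild the same PEFT-wrapped model, then load the saved weights into it
model = AutoModelForMultipleChoice.from_pretrained(model_name)
peft_config = LoraConfig(
    r=16, lora_alpha=32, task_type=TaskType.SEQ_CLS, lora_dropout=0.1,
    inference_mode=False
)
model = get_peft_model(model, peft_config)
model.load_state_dict(torch.load("model.pt"))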

trainer = Trainer(
    model=model,
    data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer)
)
test_predictions = trainer.predict(tokenized_test_dataset).predictions
test_predictions[:4]

array([[ 0.51416016,  0.9223633 ,  0.9169922 ,  1.0332031 ,  0.10644531],
       [ 0.58447266,  0.70703125,  0.7138672 ,  0.83251953,  0.5131836 ],
       [ 1.2412109 ,  0.80078125,  1.28125   ,  0.15673828,  0.94970703],
       [-0.12109375, -0.7988281 , -0.7504883 , -1.1914062 , -1.4853516 ]],
      dtype=float32)
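
To put a number on "different" versus "approximate", here is a small sketch that compares the three outputs posted above. It uses only the first four rows copied from this post, so it says nothing about the rest of the test set. The names `after_training`, `peft_reload`, `state_dict_reload`, `max_abs_diff` and `argmax_rows` are made up for this illustration.

```python
# Logits copied from the three outputs in this post (first 4 test rows only).
after_training = [
    [0.51416016, 0.9223633, 0.91748047, 1.0332031, 0.10601807],
    [0.58447266, 0.70703125, 0.7138672, 0.8330078, 0.5136719],
    [1.2412109, 0.80078125, 1.28125, 0.15576172, 0.9501953],
    [-0.12115479, -0.7988281, -0.75097656, -1.1914062, -1.4853516],
]
peft_reload = [  # loaded with PeftModel.from_pretrained
    [-0.2088623, -0.2109375, -0.2088623, -0.21069336, -0.20812988],
    [-0.20654297, -0.20532227, -0.20178223, -0.20324707, -0.20385742],
    [-0.20751953, -0.20983887, -0.20947266, -0.21105957, -0.20874023],
    [-0.21350098, -0.21508789, -0.21435547, -0.21533203, -0.2154541],
]
state_dict_reload = [  # loaded with torch.save / load_state_dict
    [0.51416016, 0.9223633, 0.9169922, 1.0332031, 0.10644531],
    [0.58447266, 0.70703125, 0.7138672, 0.83251953, 0.5131836],
    [1.2412109, 0.80078125, 1.28125, 0.15673828, 0.94970703],
    [-0.12109375, -0.7988281, -0.7504883, -1.1914062, -1.4853516],
]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equally shaped nested lists."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def argmax_rows(rows):
    """Index of the highest logit in each row, i.e. the predicted choice."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


diff_state_dict = max_abs_diff(after_training, state_dict_reload)
diff_peft = max_abs_diff(after_training, peft_reload)

print("max |diff| vs state_dict reload:", diff_state_dict)
print("max |diff| vs PeftModel reload: ", diff_peft)
print("predicted choices after training:  ", argmax_rows(after_training))
print("predicted choices state_dict reload:", argmax_rows(state_dict_reload))
print("predicted choices PeftModel reload: ", argmax_rows(peft_reload))
```

On these four rows, the state_dict reload stays within 1e-3 of the post-training logits and picks the same choices, while the PeftModel reload is off by more than 1.0 and its logits are nearly identical across all five choices.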