For me the Trainer doesn’t load the best model at the end but the latest one instead. I set load_best_model_at_end=True
and also tried specifying metric_for_best_model="eval_loss"
and greater_is_better=False
. Is anybody experiencing the same? I assume it’s the latest rather than the best model because running trainer.evaluate()
after training does not give the lowest eval_loss observed during training. I am using the newest transformers version. Thank you for your help!
This is my code:
trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback, csv_logger_callback],
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)
trainer.train()
eval_results = trainer.evaluate()
logging.info("Final evaluation results on validation set are:\n" + json.dumps(eval_results, indent=2))
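One way I check what the Trainer actually tracked is to read the trainer_state.json that it writes into each checkpoint folder: it records best_model_checkpoint, best_metric, and the full log_history. A minimal sketch of that check (the field names follow the usual Trainer checkpoint format, but the sample values and the "out/checkpoint-..." paths here are invented for illustration):

```python
import json

# Abbreviated example of what checkpoint-*/trainer_state.json contains
# (values and paths are made up for illustration).
state = {
    "best_metric": 0.41,
    "best_model_checkpoint": "out/checkpoint-1500",
    "log_history": [
        {"step": 500, "eval_loss": 0.55},
        {"step": 1000, "eval_loss": 0.47},
        {"step": 1500, "eval_loss": 0.41},
        {"step": 2000, "eval_loss": 0.44},
    ],
}

# Find the evaluation step with the lowest eval_loss ourselves...
evals = [e for e in state["log_history"] if "eval_loss" in e]
best = min(evals, key=lambda e: e["eval_loss"])

# ...and compare it to what the Trainer recorded as the best checkpoint.
print(best["step"], best["eval_loss"])
print(state["best_model_checkpoint"])
assert state["best_model_checkpoint"].endswith(str(best["step"]))
```

If best_model_checkpoint in the real file does not point at the lowest-eval_loss step, the tracking itself went wrong; if it does point there, the problem would have to be in the reload after training.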
And this is my training_args:
training_arguments:
load_best_model_at_end: True
metric_for_best_model: "eval_loss"
greater_is_better: False
max_steps: 100000
per_device_train_batch_size: 2048
per_device_eval_batch_size: 2048
optim: "schedule_free_adamw"
lr_scheduler_type: "constant"
learning_rate: 0.001
weight_decay: 0.00001
fp16: True
eval_strategy: "steps"
save_strategy: "steps"
eval_steps: 500
save_steps: 500
dataloader_num_workers: 32
dataloader_pin_memory: True
warmup_steps: 1000
tf32: True
torch_compile: True
torch_compile_backend: "inductor"
eval_on_start: True
eval_accumulation_steps: 8
save_total_limit: 2
gradient_accumulation_steps: 1
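As far as I know, load_best_model_at_end also requires the saving and evaluation schedules to line up (same strategy, and save_steps a round multiple of eval_steps), which the config above satisfies. A minimal sanity check over a plain dict mirroring the relevant YAML keys (assumption: the keys map 1:1 onto TrainingArguments, and this reimplements the constraint rather than calling transformers):

```python
# Subset of the training arguments above, as a plain dict.
args = {
    "load_best_model_at_end": True,
    "eval_strategy": "steps",
    "save_strategy": "steps",
    "eval_steps": 500,
    "save_steps": 500,
}

def best_model_schedule_ok(a):
    """Check the schedule constraints that (to my understanding) Trainer
    enforces when load_best_model_at_end is set: matching save/eval
    strategies, and save_steps a multiple of eval_steps."""
    if not a["load_best_model_at_end"]:
        return True
    if a["eval_strategy"] != a["save_strategy"]:
        return False
    if a["eval_strategy"] == "steps" and a["save_steps"] % a["eval_steps"] != 0:
        return False
    return True

print(best_model_schedule_ok(args))  # the config above passes the check
```

So the schedule itself should not be the reason the latest model is loaded here.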