Early Stopping saving second best model, not first

Hi, I am new to Transformers, so this might be a dumb question. I am trying to fine-tune a BERT model for multiple choice (MCQ). These are my training arguments:

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
    optim="adafactor",
    gradient_accumulation_steps=1,
    learning_rate=5e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    num_train_epochs=2,
    output_dir=".",
    overwrite_output_dir=True,
    load_best_model_at_end=True,
    fp16=True,
    seed=42,
    save_total_limit=1,
    logging_steps=100,
    eval_steps=100,
    evaluation_strategy="steps",
    save_strategy="steps",
    metric_for_best_model="MAP@3",
    report_to="wandb",
)

trainer = Trainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

MAP@3 is a custom metric. What I noticed from my wandb logs is that the model loaded at the end has the second-best MAP@3 on the test set, not the best. This has happened to me in 3 different runs. Any idea why this is happening?
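For context, here is a minimal sketch of what a MAP@3 `compute_metrics` for single-answer MCQ might look like (illustrative only; the function name, scoring, and array shapes are assumptions, not the exact implementation from my run):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Sketch of MAP@3 for single-answer multiple choice.

    Assumes logits of shape (batch, num_choices) and integer gold labels.
    Each example scores 1, 1/2, or 1/3 if the gold label appears at
    rank 1, 2, or 3 of the predictions, else 0; the metric is the mean.
    """
    logits, labels = eval_pred
    top3 = np.argsort(-logits, axis=1)[:, :3]  # indices of the 3 highest logits
    scores = []
    for preds, label in zip(top3, labels):
        score = 0.0
        for rank, pred in enumerate(preds):
            if pred == label:
                score = 1.0 / (rank + 1)
                break
        scores.append(score)
    return {"MAP@3": float(np.mean(scores))}
```

The returned dict key must match `metric_for_best_model` (here `"MAP@3"`) so the Trainer can compare checkpoints.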
Thank you so much
