Problem with EarlyStoppingCallback

Elidor00 · January 26, 2021, 11:42am

I set the early stopping callback in my trainer as follows:

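from transformers import EarlyStoppingCallback

trainer = MyTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    # patience of 3 evaluations, improvement threshold of 0.0
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3, early_stopping_threshold=0.0)],
)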

The values relevant to this callback in the TrainingArguments are as follows:

load_best_model_at_end=True,
metric_for_best_model="eval_loss",
greater_is_better=False

What I expect is that training continues as long as the eval_loss metric keeps dropping, and stops only once eval_loss has not dropped for 3 epochs, at which point the best model is loaded.
During training I get these values for eval_loss:

epoch1: 'eval_loss': 0.8832499384880066
epoch2: 'eval_loss': 0.6109879612922668
epoch3: 'eval_loss': 0.52149897813797
epoch4: 'eval_loss': 0.48024266958236694

Since the loss drops every epoch, I would expect training to continue. Instead, training stopped after 4 epochs, and during evaluation it loaded the model from the first epoch, where eval_loss had its highest value, as you can see in the following:

01/26/2021 11:08:57 - INFO - __main__ -  ***** Eval results *****
01/26/2021 11:08:57 - INFO - __main__ -    eval_loss = 0.8832499384880066
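
To make the expectation concrete, here is a minimal sketch of the patience rule I am assuming. It is my simplified reading of how patience and threshold should interact, not the library's actual code, and the function name first_stop_epoch is made up for illustration. Fed the four eval_loss values listed above, it never signals a stop:

```python
from typing import Optional, Sequence


def first_stop_epoch(
    eval_losses: Sequence[float],
    patience: int = 3,
    threshold: float = 0.0,
) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None.

    Hypothetical helper: a simplified patience rule for a metric where
    lower is better. Not the transformers implementation.
    """
    best = float("inf")
    bad_evals = 0  # consecutive evaluations without improvement
    for epoch, loss in enumerate(eval_losses, start=1):
        if best - loss > threshold:
            # Improvement: remember the new best and reset the counter.
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
        if bad_evals >= patience:
            return epoch
    return None


# eval_loss values per epoch, copied from the post above
reported = [
    0.8832499384880066,
    0.6109879612922668,
    0.52149897813797,
    0.48024266958236694,
]
print(first_stop_epoch(reported))  # None
```

Under this rule a stop would only happen after three evaluations in a row without improvement, which is not what these numbers show.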

Have I set some parameter incorrectly?
Thanks!

EDIT: to clarify, I also printed the TrainerState values at the end of the training:

log_history=[
{'eval_loss': 0.837020993232727, 'eval_accuracy_score': 0.8039973127309372, 'eval_precision': 0.7904381747255738, 'eval_recall': 0.7808047316067748, 'eval_f1': 0.7855919213776935, 'eval_runtime': 8.375, 'eval_samples_per_second': 67.343, 'epoch': 1.0, 'step': 411}, {'loss': 1.5377, 'learning_rate': 4.6958980235865466e-05, 'epoch': 1.22, 'step': 500}, 
{'eval_loss': 0.6051444411277771, 'eval_accuracy_score': 0.8406953308700034, 'eval_precision': 0.8297104717236403, 'eval_recall': 0.8243570212384622, 'eval_f1': 0.8270250831610176, 'eval_runtime': 8.3919, 'eval_samples_per_second': 67.208, 'epoch': 2.0, 'step': 822}, {'loss': 0.6285, 'learning_rate': 4.3917595505563304e-05, 'epoch': 2.43, 'step': 1000}, 
{'eval_loss': 0.5184187889099121, 'eval_accuracy_score': 0.856567013772254, 'eval_precision': 0.8464932024849194, 'eval_recall': 0.8425486154673358, 'eval_f1': 0.8445163028833199, 'eval_runtime': 8.4159, 'eval_samples_per_second': 67.016, 'epoch': 3.0, 'step': 1233}, {'loss': 0.4561, 'learning_rate': 4.087621077526113e-05, 'epoch': 3.65, 'step': 1500}, 
{'eval_loss': 0.46523478627204895, 'eval_accuracy_score': 0.868743701713134, 'eval_precision': 0.8599369085173502, 'eval_recall': 0.8550049287570571, 'eval_f1': 0.8574638267277793, 'eval_runtime': 8.3682, 'eval_samples_per_second': 67.398, 'epoch': 4.0, 'step': 1644}, {'train_runtime': 1783.4323, 'train_samples_per_second': 4.609, 'epoch': 4.0, 'step': 1644}
], 
best_metric=0.837020993232727

As you can also see here, best_metric is the eval_loss of the first epoch and not the lowest value among the epochs completed (there are still only a few of them, because the value is always decreasing and therefore the training should not even stop …).
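
A quick way to check this against the log is sketched below. It assumes the list has the same shape as the log_history printed above: dicts where only the evaluation entries carry an "eval_loss" key. The entries are trimmed copies of that output, and lowest_eval_entry is a hypothetical helper name:

```python
from typing import Any, Dict, List


def lowest_eval_entry(log_history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the evaluation entry with the smallest eval_loss.

    Hypothetical helper; training-only entries (no "eval_loss") are skipped.
    """
    eval_entries = [entry for entry in log_history if "eval_loss" in entry]
    return min(eval_entries, key=lambda entry: entry["eval_loss"])


# Trimmed copy of the log_history above (only the keys needed here).
log_history = [
    {"eval_loss": 0.837020993232727, "epoch": 1.0, "step": 411},
    {"loss": 1.5377, "epoch": 1.22, "step": 500},
    {"eval_loss": 0.6051444411277771, "epoch": 2.0, "step": 822},
    {"loss": 0.6285, "epoch": 2.43, "step": 1000},
    {"eval_loss": 0.5184187889099121, "epoch": 3.0, "step": 1233},
    {"loss": 0.4561, "epoch": 3.65, "step": 1500},
    {"eval_loss": 0.46523478627204895, "epoch": 4.0, "step": 1644},
]
reported_best_metric = 0.837020993232727  # best_metric printed above

best = lowest_eval_entry(log_history)
print(best["epoch"], best["eval_loss"])
# True would mean best_metric matches the lowest logged eval_loss.
print(best["eval_loss"] == reported_best_metric)
```

On this data the lowest eval_loss belongs to epoch 4, while the reported best_metric equals the epoch 1 value.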
