early_stopping_patience param in EarlyStoppingCallback

Hi there,
I am quite confused about the early_stopping_patience parameter in EarlyStoppingCallback.
Is it related to the evaluation_strategy in TrainingArguments?
For example, with evaluation_strategy='epoch' in TrainingArguments and early_stopping_patience=8 in the callback, will training stop if the metric/loss does not improve for 8 consecutive epochs? And does it work the same way with evaluation_strategy='steps'?


EarlyStoppingCallback works together with evaluation_strategy and metric_for_best_model.

  • early_stopping_patience (int) — Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.
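
To make "evaluation calls" concrete, here is a simplified sketch I wrote to illustrate the counting (not the library's actual implementation): the counter resets whenever the tracked metric improves on the best value seen so far, and training stops once the metric has failed to improve for early_stopping_patience consecutive evaluations, whether those evaluations happen once per epoch or every N steps.

```python
def should_stop(metric_history, patience, greater_is_better=False):
    """Return True once the metric has gone `patience` consecutive
    evaluation calls without improving on the best value seen so far."""
    best = None
    bad_evals = 0
    for value in metric_history:
        improved = best is None or (value > best if greater_is_better else value < best)
        if improved:
            best = value     # new best value resets the counter
            bad_evals = 0
        else:
            bad_evals += 1   # one more evaluation call without improvement
            if bad_evals >= patience:
                return True
    return False

# eval loss per evaluation call: no new best after the 2nd call,
# so patience=8 trips on the 10th call
history = [0.90, 0.80, 0.85, 0.84, 0.83, 0.86, 0.82, 0.87, 0.88, 0.85, 0.84]
print(should_stop(history, patience=8))  # True
```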

I was also confused about whether to use it with evaluation_strategy='steps' or 'epoch', but after some trials I found it better to use 'epoch', which guarantees the model is trained on the whole dataset between evaluation calls.
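
For comparison, here is a hypothetical steps-based setup (the eval_steps=500 value and output_dir are only illustrative): patience is then counted in 500-step intervals, so early_stopping_patience=8 would mean 8 * 500 = 4000 steps without improvement, possibly well before an epoch boundary.

```python
from transformers import TrainingArguments

args_steps = TrainingArguments(
    output_dir="out_steps",
    evaluation_strategy="steps",   # one evaluation call every eval_steps
    eval_steps=500,
    save_strategy="steps",         # must match evaluation_strategy when
    save_steps=500,                # load_best_model_at_end=True
    load_best_model_at_end=True,
    metric_for_best_model="mae",
    greater_is_better=False,
)
```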

If you use early_stopping_patience in EarlyStoppingCallback, you must:

  1. Pass a function that returns an evaluation dict to the compute_metrics param of the Trainer class.

  2. Use metric_for_best_model to pick which key from that dict to track (e.g. mae or mse); the Trainer logs these keys with an eval_ prefix, and you can give the name with or without it.

  3. Use greater_is_better to say whether a higher or lower value of that metric is better. For mae or mse, lower is better, so set it to False.

My code:

```python
from sklearn.metrics import mean_squared_error, mean_absolute_error
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback


def compute_metrics(eval_pred):
    # Returns the evaluation dict; the Trainer logs these as eval_mse / eval_mae.
    predictions, labels = eval_pred
    predictions = predictions[:, 0]  # single-output regression head
    mse = mean_squared_error(labels, predictions)
    mae = mean_absolute_error(labels, predictions)
    return {"mse": mse, "mae": mae}


training_args = TrainingArguments(
    output_dir=f"{model_path.split('/')[-1]}_regression_finetuned_{output_name}",
    evaluation_strategy="epoch",   # one evaluation call per epoch
    save_strategy="epoch",
    save_total_limit=2,
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,   # required by EarlyStoppingCallback
    metric_for_best_model="mae",   # key returned by compute_metrics
    greater_is_better=False,       # lower MAE is better
    warmup_steps=warmup_steps,
    lr_scheduler_type="cosine",
    logging_dir="./logs",
    logging_steps=50,
    push_to_hub=True,
    run_name="run_cosine_decay_regression",
    fp16=False,
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=data_collator,
    # Stop after 2 consecutive evaluations (here: epochs) without MAE improvement.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
```
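
With evaluation_strategy="epoch" and early_stopping_patience=2 as above, training stops as soon as eval_mae has not improved for 2 consecutive epochs. If small fluctuations keep resetting the counter, the callback also accepts an early_stopping_threshold: an evaluation only counts as an improvement when it beats the best value by more than that amount, e.g. EarlyStoppingCallback(early_stopping_patience=2, early_stopping_threshold=0.001).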