Early stopping training using Validation loss as the metric for best model

Hi,

I am trying to fine-tune a pegasus/bigbird model on a custom dataset and have discovered that the model is prone to overfitting after a few epochs. I am trying to use an early stopping callback to stop training as soon as the validation loss increases, but I'm unable to find documentation on the metric name to use for the validation loss.

I have tried the following trainer arguments:

training_args = Seq2SeqTrainingArguments(
    output_dir="./bigbird_tldr_acl_trained",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    save_total_limit=1,
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=False,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    optim="adamw_torch",
    learning_rate=5e-5,
    weight_decay=0.01,
    report_to="none",
    metric_for_best_model='Validation Loss',
    load_best_model_at_end=True,
    greater_is_better=False
)

but am receiving the following error:

early stopping required metric_for_best_model, but did not find eval_Validation Loss so early stopping is disabled

KeyError: 'eval_Validation Loss'

Can someone please point me towards the metric name I should use for this to work?

Update: I found that the correct value for metric_for_best_model is “eval_loss”, which is the key the Trainer uses when it logs the validation loss.
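
For anyone landing here later, here is a minimal sketch of how this fits together. The model, tokenizer, and dataset names are placeholders, and I've dropped the other training arguments from my original config for brevity:

from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="./bigbird_tldr_acl_trained",
    evaluation_strategy="epoch",        # evaluate once per epoch
    save_strategy="epoch",              # must match evaluation_strategy for load_best_model_at_end
    num_train_epochs=5,
    metric_for_best_model="eval_loss",  # the Trainer reports validation loss under this key
    load_best_model_at_end=True,
    greater_is_better=False,            # lower loss is better
)

trainer = Seq2SeqTrainer(
    model=model,                        # your pegasus/bigbird model (placeholder)
    args=training_args,
    train_dataset=train_dataset,        # placeholder dataset names
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],  # stop after the first epoch with no improvement
)

trainer.train()

With early_stopping_patience=1, training stops as soon as the validation loss fails to improve for one evaluation, and load_best_model_at_end restores the best checkpoint afterwards.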
