Early stopping training using Validation loss as the metric for best model

Hi,

I am trying to fine-tune a pegasus/bigbird model on a custom dataset and have discovered that the model is prone to overfitting after a few epochs. I am trying to use an early stopping callback to stop training as soon as the validation loss increases, but I'm unable to find documentation on the metric name to use for the validation loss.

I have tried the following trainer arguments:

training_args = Seq2SeqTrainingArguments(
    output_dir="./bigbird_tldr_acl_trained",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    save_total_limit=1,
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=False,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    optim="adamw_torch",
    learning_rate=5e-5,
    weight_decay=0.01,
    report_to="none",
    metric_for_best_model='Validation Loss',
    load_best_model_at_end=True,
    greater_is_better=False
)

but am receiving the following error:

early stopping required metric_for_best_model, but did not find eval_Validation Loss so early stopping is disabled

KeyError: 'eval_Validation Loss'

Can someone please point me towards the metric name I should use for this to work?

Update: I found that the correct value for metric_for_best_model is “eval_loss”, which is the key the Trainer uses when it logs the validation loss.
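
For anyone landing here later, here is a minimal sketch of how this fits together. The model, tokenizer, and dataset names are placeholders, and I've dropped the other training arguments from my original config for brevity:

from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="./bigbird_tldr_acl_trained",
    evaluation_strategy="epoch",        # evaluate once per epoch
    save_strategy="epoch",              # must match evaluation_strategy for load_best_model_at_end
    num_train_epochs=5,
    metric_for_best_model="eval_loss",  # the Trainer reports validation loss under this key
    load_best_model_at_end=True,
    greater_is_better=False,            # lower loss is better
)

trainer = Seq2SeqTrainer(
    model=model,                        # your pegasus/bigbird model (placeholder)
    args=training_args,
    train_dataset=train_dataset,        # placeholder dataset names
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],  # stop after the first epoch with no improvement
)

trainer.train()

With early_stopping_patience=1, training stops as soon as the validation loss fails to improve for one evaluation, and load_best_model_at_end restores the best checkpoint afterwards.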
