Hi there,
I am quite confused about the early_stopping_patience in EarlyStoppingCallback.
Is it related to the evaluation_strategy in TrainingArguments?
For example, when evaluation_strategy='epoch' and early_stopping_patience=8, will training stop if the metric/loss does not improve for 8 consecutive epochs? And does it work the same way when evaluation_strategy='steps'?
EarlyStoppingCallback is related to evaluation_strategy and metric_for_best_model. From the docs:

- early_stopping_patience (int) — Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.
I was also confused about whether to use it with evaluation_strategy='steps' or 'epoch', but after some trials I realized it is better to use 'epoch', to guarantee that the model is trained on the whole dataset between evaluations.
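To make the "evaluation calls" wording concrete, here is a minimal sketch (not the Hugging Face implementation, and it ignores the early_stopping_threshold argument) of how patience counts evaluation events, whichever strategy triggers them:

```python
# Sketch of patience counting: one entry per evaluation event,
# regardless of whether evaluations fire per epoch or per eval_steps.

def should_stop(metric_history, patience, greater_is_better=False):
    """Return True once the metric has failed to improve on the best
    value for `patience` consecutive evaluations."""
    best = None
    bad_evals = 0
    for value in metric_history:
        improved = (
            best is None
            or (value > best if greater_is_better else value < best)
        )
        if improved:
            best = value
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return True
    return False

# With evaluation_strategy="epoch", each entry is one epoch's eval metric;
# patience=2 stops after 2 epochs without improvement:
print(should_stop([0.9, 0.8, 0.85, 0.82], patience=2))  # True
# With evaluation_strategy="steps" and eval_steps=500, the same patience=2
# corresponds to 2 * 500 = 1000 training steps without improvement.
```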
If you use early_stopping_patience in EarlyStoppingCallback, you must:

- Pass a function that returns an evaluation dict to the compute_metrics param of the Trainer class.
- Use metric_for_best_model to set the evaluation key from compute_metrics, e.g. mae, mse…
- Use greater_is_better to specify whether a greater or lower value of that metric is better. For mae or mse, lower is better.
My code:
```python
from sklearn.metrics import mean_squared_error, mean_absolute_error
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = predictions[:, 0]
    mse = mean_squared_error(labels, predictions)
    mae = mean_absolute_error(labels, predictions)
    return {"mse": mse, "mae": mae}

training_args = TrainingArguments(
    output_dir=f"{model_path.split('/')[-1]}_regression_finetuned_{output_name}",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model="mae",
    greater_is_better=False,
    warmup_steps=warmup_steps,
    lr_scheduler_type="cosine",
    logging_dir="./logs",
    logging_steps=50,
    push_to_hub=True,
    run_name="run_cosine_decay_regression",
    fp16=False,
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=data_collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
```