Using the specific loss of a dataset as the early stopping metric

dduka · March 13, 2024, 1:07pm

Hi everyone,

I’m trying to fine-tune an XGLM model from Hugging Face for the quy_Latn language. Currently, I have one training dataset and several datasets that I would like to evaluate my model on.

I want to use EarlyStoppingCallback in my Trainer instance and choose which dataset's loss the early stopping monitors. For example, if I pass two evaluation datasets to the Trainer, named eng_Latn and quy_Latn, I would like the loss on the second one to determine the best model.

Here is some code from my script:

from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir=f"{dir_path}/checkpoints",
    logging_dir=f"{dir_path}/model_logs",
    save_strategy="steps",
    evaluation_strategy="steps",
    save_steps=1,  # Save a checkpoint every step
    eval_steps=1,  # Evaluate every step
    save_total_limit=1,  # Only keep one checkpoint
    per_device_train_batch_size=PER_DEVICE_TRAIN_BATCH_SIZE,
    per_device_eval_batch_size=PER_DEVICE_EVAL_BATCH_SIZE,
    num_train_epochs=EPOCHS,
    remove_unused_columns=False,
    report_to="all",
    logging_steps=10,
    fp16=True,
    greater_is_better=False,  # Lower loss is better
    metric_for_best_model="quy_Latn_loss",  # Looked up as "eval_quy_Latn_loss"
    load_best_model_at_end=True,
    save_safetensors=False,
    prediction_loss_only=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_datasets,  # Contains the eng_Latn and quy_Latn datasets
    data_collator=lambda data: collate_fn(data, tokenizer),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)

The problem is that I get an error saying that the eval_quy_Latn_loss metric does not exist, even though I can see that loss in the logs. Does anyone know what is going wrong here?
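To make the behaviour I'm after concrete, here is a minimal sketch in plain Python that does not use the library. It rests on two assumptions I have not verified against the Trainer source for my version: that the metric name gets an "eval_" prefix when it lacks one, and that each evaluation dataset may report its metrics in a separate dictionary, so some dictionaries will not contain the quy_Latn loss. resolve_metric_key and PatienceTracker are hypothetical names for illustration, not part of transformers.

```python
from typing import Dict, Optional


def resolve_metric_key(metric_for_best_model: str) -> str:
    """Add the "eval_" prefix if it is missing (assumed Trainer behaviour)."""
    if metric_for_best_model.startswith("eval_"):
        return metric_for_best_model
    return f"eval_{metric_for_best_model}"


class PatienceTracker:
    """Hypothetical helper: patience-based early stopping on one metric key."""

    def __init__(self, metric_key: str, patience: int, greater_is_better: bool = False):
        self.metric_key = metric_key
        self.patience = patience
        self.greater_is_better = greater_is_better
        self.best: Optional[float] = None
        self.bad_evals = 0

    def update(self, metrics: Dict[str, float]) -> bool:
        """Return True once the metric has failed to improve `patience` times."""
        value = metrics.get(self.metric_key)
        if value is None:
            # Metrics from a different dataset: skip instead of raising.
            return False
        if self.best is None:
            improved = True
        elif self.greater_is_better:
            improved = value > self.best
        else:
            improved = value < self.best
        if improved:
            self.best = value
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Made-up losses, one dictionary per dataset per evaluation round.
tracker = PatienceTracker(resolve_metric_key("quy_Latn_loss"), patience=2)
history = [
    {"eval_eng_Latn_loss": 2.0}, {"eval_quy_Latn_loss": 3.0},
    {"eval_eng_Latn_loss": 1.9}, {"eval_quy_Latn_loss": 3.1},
    {"eval_eng_Latn_loss": 1.8}, {"eval_quy_Latn_loss": 3.2},
]
stops = [tracker.update(m) for m in history]
```

In this sketch the eng_Latn dictionaries are ignored, and the stop signal only fires after the quy_Latn loss has failed to improve twice in a row. That is the behaviour I would like from the callback.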

Thanks in advance.
