Autogenerated model cards not showing the best metrics when using "load_best_model_at_end=True"

I am fine-tuning a BERT model for a token classification task. My training arguments are as follows:

args = TrainingArguments(
    output_dir="bert-finetuned-ner",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=5,
    metric_for_best_model="f1",
    greater_is_better=True,
    load_best_model_at_end=True,
    learning_rate=2e-5,
    num_train_epochs=50,
    weight_decay=0.01,
    logging_steps=10,
    logging_strategy="epoch",
    push_to_hub=True
)

I am using the Trainer class to train the model:

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

I then train the model and push it to the Hub using the following code:

trainer.train()
trainer.push_to_hub()

Training ran for 10 epochs and then stopped via the EarlyStoppingCallback. The model achieved its highest F1 score at the 7th epoch, but the auto-generated model card on the Hugging Face Hub displays the metrics from the final (10th) epoch, which has a lower F1 score.

This leaves me uncertain whether the model weights pushed to the Hub are from the 7th or the 10th epoch: since I specified load_best_model_at_end=True in my TrainingArguments, I expected the model card to reflect the 7th-epoch metrics.

Is there a way to determine which epoch's checkpoint ended up in the model pushed to the Hub? I have looked through the HF Transformers and Hub documentation but have not found anything about this.
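For context, the closest thing I have found so far is that each saved checkpoint contains a trainer_state.json file, and the in-memory trainer.state has best_model_checkpoint and best_metric fields after training. Here is a minimal sketch of how one could read that file to see which checkpoint was considered best (the helper name is mine; the JSON fields are the ones the Trainer writes):

    import json
    from pathlib import Path

    def best_checkpoint_info(checkpoint_dir):
        """Read trainer_state.json from a saved checkpoint directory and
        return (best_model_checkpoint, best_metric) as recorded there."""
        state = json.loads(Path(checkpoint_dir, "trainer_state.json").read_text())
        return state["best_model_checkpoint"], state["best_metric"]

So after training one could call, e.g., best_checkpoint_info("bert-finetuned-ner/checkpoint-1750") (a hypothetical checkpoint path) or simply print trainer.state.best_model_checkpoint, but it is still unclear to me whether the pushed model card is supposed to use those best metrics.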
