I am fine-tuning a BERT model for a token classification task. My training arguments are as follows:
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert-finetuned-ner",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=5,
    metric_for_best_model="f1",
    greater_is_better=True,
    load_best_model_at_end=True,
    learning_rate=2e-5,
    num_train_epochs=50,
    weight_decay=0.01,
    logging_steps=10,
    logging_strategy="epoch",
    push_to_hub=True,
)
I am using the Trainer class to train the model:
from transformers import Trainer, EarlyStoppingCallback

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
I then train the model and push it to the Hub using the following code:
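# run training, then push the resulting model and auto-generated model card to the Hub
trainer.train()
trainer.push_to_hub()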
The model trained for 10 epochs and then stopped due to the EarlyStoppingCallback. It achieved its highest F1 score at the 7th epoch, but the auto-generated model card on the Hugging Face Hub displays the metrics from the final (10th) epoch, which has a lower F1 score.
This leaves me uncertain whether the model checkpoint pushed to the Hub is from the 7th epoch or the 10th: I had specified load_best_model_at_end=True in my TrainingArguments, so I expected the model card to reflect the metrics from the 7th epoch.
Is there a way to determine which epoch's checkpoint was used for the model that was pushed to the Hub? I have looked through the HF Transformers and Hub documentation, but have not found any information on this.