Hey cramraj8, I think that if you use the following in the training config
save_total_limit=2
save_strategy="steps" (or "epoch")
together with load_best_model_at_end=True, then the best and the latest models will be saved. You can compare the checkpoint numbers of these two models: the directory with the larger number is essentially the latest iteration.
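Something like this, as a minimal sketch (output_dir, the step counts, the batch size and the metric are placeholder values, adjust them to your setup):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",              # checkpoints land in out/checkpoint-<step>
    save_strategy="steps",         # save a checkpoint every save_steps steps
    save_steps=500,
    evaluation_strategy="steps",   # evaluation is needed to define a "best" model
    eval_steps=500,                # should line up with save_steps for load_best_model_at_end
    save_total_limit=2,            # keep at most two checkpoints on disk
    load_best_model_at_end=True,   # the best checkpoint is kept alongside the latest one
    metric_for_best_model="loss",
    per_device_train_batch_size=8,
)
```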
Since load_best_model_at_end=True is in the config as well, you can also read trainer.state.best_model_checkpoint
after training to get the path (and hence the step number) of the best checkpoint, and from that you can infer that the other checkpoint directory contains the latest model.
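Here is a small sketch of how that lookup could look, assuming the trainer and training_args objects from the snippet above and that checkpoints sit directly under output_dir:

```python
import os
import re

# After trainer.train() has finished:
best_ckpt = trainer.state.best_model_checkpoint   # e.g. "out/checkpoint-1500"
best_step = int(best_ckpt.split("-")[-1])

# Scan the output directory for checkpoint-<step> folders and pick the newest one.
ckpt_dirs = [
    d for d in os.listdir(training_args.output_dir)
    if re.fullmatch(r"checkpoint-\d+", d)
]
latest_ckpt = max(ckpt_dirs, key=lambda d: int(d.split("-")[-1]))
latest_step = int(latest_ckpt.split("-")[-1])

print(f"best checkpoint at step {best_step}, latest checkpoint at step {latest_step}")
```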
This is not exact, but if you use save_strategy="steps"
and save_steps=NUMBER, each saved checkpoint directory is named checkpoint-<step>, where <step> is the number of optimization steps completed up to that point, so the number of training examples seen so far is approximately that step count multiplied by the batch size defined in per_device_train_batch_size
(and by the number of devices and gradient_accumulation_steps, if you use them).
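As a back-of-the-envelope sketch of that estimate (assuming a single device and no gradient accumulation; the numbers are made up):

```python
# Rough estimate, assuming one device and no gradient accumulation.
per_device_train_batch_size = 8
checkpoint_step = 1500        # taken from the checkpoint-1500 directory name

examples_seen = checkpoint_step * per_device_train_batch_size
print(examples_seen)          # 12000 training examples processed up to that checkpoint
```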