What is the purpose of save_pretrained()?

Hi there! The question is a bit odd in the sense that you are asking: “Why does the model have this method when the Trainer has that method?”. The basic answer is: because they are two different objects.

Not everyone uses the Trainer to train their model, so there needs to be a method directly on the model to properly save it.
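For instance, if you train with your own PyTorch loop, you can call `save_pretrained` yourself. A minimal sketch (the model name and output folder here are just illustrative):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ... your own training loop goes here ...

# Write the config + weights (and tokenizer files) to a folder you can reload later
model.save_pretrained("my-finetuned-model")
tokenizer.save_pretrained("my-finetuned-model")

# Reload later like any other pretrained checkpoint
model = AutoModelForSequenceClassification.from_pretrained("my-finetuned-model")
```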

Now inside the Trainer, you could very well never save any checkpoint (save_strategy="no"), or the last saved checkpoint could come before the end of training (with save_strategy="steps"), so you won’t necessarily have the final model automatically saved inside a checkpoint. See the sketch below.
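For example, with arguments like these (the directory name is just illustrative), checkpoints are only written every 500 steps, so the very last weights may not coincide with any saved checkpoint:

```python
from transformers import TrainingArguments

# Checkpoints land in my-output-dir/checkpoint-500, checkpoint-1000, ...
# If training stops at step 1234, the newest checkpoint is still checkpoint-1000.
args = TrainingArguments(
    output_dir="my-output-dir",
    save_strategy="steps",  # or "no" to never save, "epoch" to save once per epoch
    save_steps=500,
)
```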

A checkpoint, by the way, is just a folder with your model, tokenizer (if it was passed to the Trainer), and all the files necessary to resume training from there (optimizer state, learning-rate scheduler state, trainer state, etc.).
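Because a checkpoint also holds the optimizer/scheduler/trainer state, you can pick training back up from it. A small sketch, assuming a Trainer named `trainer` already exists and the checkpoint path is illustrative:

```python
# Resume from the most recent checkpoint found in output_dir
trainer.train(resume_from_checkpoint=True)

# Or point to a specific checkpoint folder
trainer.train(resume_from_checkpoint="my-output-dir/checkpoint-500")
```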

To save your model at the end of training, you should use trainer.save_model(optional_output_dir), which behind the scenes calls the save_pretrained of your model (optional_output_dir is optional and defaults to the output_dir you set).
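Concretely, at the end of training it looks like this (a sketch; the folder name is just an example):

```python
trainer.train()

# Saves the final model (and the tokenizer, if one was passed to the Trainer),
# calling model.save_pretrained() under the hood.
trainer.save_model("my-final-model")  # omit the argument to use args.output_dir

# The resulting folder reloads like any pretrained checkpoint
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("my-final-model")
```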
