What is the purpose of save_pretrained()?

Hi there! The question is a bit odd in the sense that you are asking: “Why does the model have this method when the Trainer has that method?”. The basic answer is: because they are two different objects.

Not everyone uses the Trainer to train their model, so there needs to be a method directly on the model to properly save it.
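For instance, if you train with your own PyTorch loop, you can call `save_pretrained` yourself. A minimal sketch (the model name and output folder here are just illustrative):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ... your own training loop goes here ...

# Write the config + weights (and tokenizer files) to a folder you can reload later
model.save_pretrained("my-finetuned-model")
tokenizer.save_pretrained("my-finetuned-model")

# Reload later like any other pretrained checkpoint
model = AutoModelForSequenceClassification.from_pretrained("my-finetuned-model")
```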

Now inside the Trainer, you could very well never save any checkpoint (save_strategy="no"), or the last saved checkpoint could come before the end of training (with save_strategy="steps"), so you won’t necessarily have the final model automatically saved inside a checkpoint. See the sketch below.
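For example, with arguments like these (the directory name is just illustrative), checkpoints are only written every 500 steps, so the very last weights may not coincide with any saved checkpoint:

```python
from transformers import TrainingArguments

# Checkpoints land in my-output-dir/checkpoint-500, checkpoint-1000, ...
# If training stops at step 1234, the newest checkpoint is still checkpoint-1000.
args = TrainingArguments(
    output_dir="my-output-dir",
    save_strategy="steps",  # or "no" to never save, "epoch" to save once per epoch
    save_steps=500,
)
```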

A checkpoint, by the way, is just a folder with your model, tokenizer (if it was passed to the Trainer), and all the files necessary to resume training from there (optimizer state, learning-rate scheduler state, trainer state, etc.).
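Because a checkpoint also holds the optimizer/scheduler/trainer state, you can pick training back up from it. A small sketch, assuming a Trainer named `trainer` already exists and the checkpoint path is illustrative:

```python
# Resume from the most recent checkpoint found in output_dir
trainer.train(resume_from_checkpoint=True)

# Or point to a specific checkpoint folder
trainer.train(resume_from_checkpoint="my-output-dir/checkpoint-500")
```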

To save your model at the end of training, you should use trainer.save_model(optional_output_dir), which behind the scenes calls the save_pretrained of your model (optional_output_dir is optional and defaults to the output_dir you set).
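Concretely, at the end of training it looks like this (a sketch; the folder name is just an example):

```python
trainer.train()

# Saves the final model (and the tokenizer, if one was passed to the Trainer),
# calling model.save_pretrained() under the hood.
trainer.save_model("my-final-model")  # omit the argument to use args.output_dir

# The resulting folder reloads like any pretrained checkpoint
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("my-final-model")
```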
