What is the purpose of save_pretrained()?

ThomasG · August 11, 2021, 8:14pm

Hello everyone. I hope my question is not too silly, but there is something that confuses me.

Let’s say I load a Hugging Face model using the from_pretrained() method, and then fine-tune it using the Trainer class. Via TrainingArguments, I can set an argument called output_dir. If I specify a directory here, won’t my model be saved in this directory, so that I can load it again later from this folder using from_pretrained()?

Here lies my question: if this argument lets me save the model, what is the purpose of save_pretrained()?

Looking at their respective documentation, however, it is clear that they do different things: the first saves something called checkpoints, while the second saves “the model and its configuration”.

Can someone explain the difference between saving checkpoints and saving the model? Won’t the last checkpoint be the same as the model and its weights?

Thanks in advance.

sgugger · August 12, 2021, 6:25am

Hi there! The question is a bit weird, in the sense that you are asking: “Why does the model have this method when the Trainer has that one?” The basic answer is: “because they are two different objects.”

Not everyone uses the Trainer to train their model, so there needs to be a method directly on the model to properly save it.
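As a minimal sketch of saving directly from the model, with no Trainer involved: the tiny BertConfig sizes below are arbitrary assumptions chosen so that nothing has to be downloaded, and the temporary directory stands in for whatever folder you would really use.

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Tiny, randomly initialised model so nothing has to be downloaded.
# The sizes are arbitrary assumptions for illustration only.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)

with tempfile.TemporaryDirectory() as save_dir:
    # Writes the weights and config.json into save_dir.
    model.save_pretrained(save_dir)
    # Rebuilds the architecture from config.json and loads the saved weights.
    reloaded = BertModel.from_pretrained(save_dir)

    # Compare every saved tensor by name after the round trip.
    reloaded_state = reloaded.state_dict()
    weights_match = all(
        torch.equal(tensor, reloaded_state[name])
        for name, tensor in model.state_dict().items()
    )

print(weights_match)
```

The same two calls work whether the model was trained with the Trainer, a hand-written PyTorch loop, or not at all.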

Now, inside the Trainer, you could very well never save any checkpoint (save_strategy="no"), or the last checkpoint saved could be from before the end of training (with save_strategy="steps"), so you won’t necessarily have the final model saved in a checkpoint automatically.

A checkpoint, by the way, is just a folder with your model, tokenizer (if it was passed to the Trainer) and all the files needed to resume training from there (optimizer state, lr scheduler state, trainer state, etc.).

To save your model at the end of training, you should use trainer.save_model(optional_output_dir), which calls your model’s save_pretrained behind the scenes (optional_output_dir is optional and defaults to the output_dir you set).

ThomasG · August 12, 2021, 9:57am

Hello. Thank you very much for the detailed answer!

By the way, if I create a model class that inherits from torch.nn.Module and slightly alters a Hugging Face pretrained model (e.g. by adding a different classification head), then train it using native PyTorch, I should use torch.save() instead, right?

sgugger · August 12, 2021, 10:18am

You should subclass PreTrainedModel if your model is very similar to a Transformers model, to be able to retain the full functionality.
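Something along these lines would be one way to do it. BertWithCustomHead is a hypothetical class name, the linear head on the first token is just an illustrative choice, and the tiny config sizes are assumptions so the sketch runs without downloading anything.

```python
import tempfile

import torch
from torch import nn
from transformers import BertConfig, BertModel, PreTrainedModel


class BertWithCustomHead(PreTrainedModel):
    """Hypothetical example: a BERT encoder with a custom linear head."""

    config_class = BertConfig

    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config)
        self.head = nn.Linear(config.hidden_size, config.num_labels)
        self.post_init()  # standard Transformers hook for weight init

    def forward(self, input_ids, attention_mask=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        first_token_state = outputs.last_hidden_state[:, 0]
        return self.head(first_token_state)


# Arbitrary tiny sizes, chosen only to keep the sketch fast.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,
)
model = BertWithCustomHead(config).eval()
input_ids = torch.randint(1, config.vocab_size, (2, 8))

with tempfile.TemporaryDirectory() as save_dir:
    # Both methods are inherited from PreTrainedModel.
    model.save_pretrained(save_dir)
    reloaded = BertWithCustomHead.from_pretrained(save_dir).eval()

    with torch.no_grad():
        logits = model(input_ids)
        same_logits = torch.allclose(logits, reloaded(input_ids), atol=1e-5)

logits_shape = tuple(logits.shape)
print(same_logits, logits_shape)
```

Because the class carries its config, the custom head is saved and restored together with the encoder, and the model can also be handed to the Trainer.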

Otherwise yes, you should just use torch.save.
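For the plain torch.nn.Module route, a sketch might look like this. PlainWrapper and the file name wrapper.pt are made up for the example, and saving the state_dict rather than the whole object is the usual PyTorch convention, not something specific to Transformers.

```python
import os
import tempfile

import torch
from torch import nn
from transformers import BertConfig, BertModel


class PlainWrapper(nn.Module):
    """Hypothetical wrapper: a plain nn.Module around a Transformers encoder."""

    def __init__(self, config, num_classes):
        super().__init__()
        self.encoder = BertModel(config)
        self.head = nn.Linear(config.hidden_size, num_classes)

    def forward(self, input_ids):
        hidden = self.encoder(input_ids=input_ids).last_hidden_state
        return self.head(hidden[:, 0])


# Arbitrary tiny sizes so the sketch needs no download.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = PlainWrapper(config, num_classes=3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "wrapper.pt")
    torch.save(model.state_dict(), path)

    # There is no from_pretrained() here: rebuild the module by hand,
    # then load the saved tensors into it.
    restored = PlainWrapper(config, num_classes=3)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

restored_state = restored.state_dict()
weights_match = all(
    torch.equal(tensor, restored_state[name])
    for name, tensor in model.state_dict().items()
)
has_save_pretrained = hasattr(restored, "save_pretrained")
print(weights_match, has_save_pretrained)
```

The trade-off is that the config is not stored with the weights, so you need the class definition and the constructor arguments at hand when reloading.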

ThomasG · August 12, 2021, 10:31am

Cool. As far as I can see, PreTrainedModel inherits from torch.nn.Module, so I guess there shouldn’t be much difference.

Thank you for everything. Have a nice day.
