Move trainer.save_pretrained("xyz") to CPU

Hi,

I am using the Trainer class to train a language model on my RTX 3060. It just barely fits in the 3060’s 12 GB of GPU RAM. Once training has finished, I would like to save the model via trainer.save_pretrained("my_local_folder"). However, that command raises RuntimeError: CUDA out of memory.

So I have tried the following to move the trainer to the CPU and save the model from there:

  1. trainer.to("cpu").save_pretrained("my_local_folder")
    or
  2. trainer.to("cpu")
    trainer.save_pretrained("my_local_folder")
    or
  3. trainer.push_to_hub("my_hub_model")

yet in each case I still get the same RuntimeError. How can I save my model?

Your help is much appreciated!
Matthias

ChatGPT had the answer (adapted):

save_folder = "model_and_tokenizer"
model = trainer.model                    # the trained model lives on trainer.model, not on the Trainer itself
model.cpu().save_pretrained(save_folder) # move the weights to the CPU first, then save
tokenizer.save_pretrained(save_folder)   # save the tokenizer alongside it (assumes tokenizer is still in scope)
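
The underlying pattern is plain PyTorch, not Trainer-specific: move the parameters off the GPU before serializing them. A minimal sketch with a stand-in nn.Linear module and torch.save in place of save_pretrained (the module and the file name linear_cpu.pt are made-up examples, not the setup above):

```python
import torch

# Stand-in for the trained model (hypothetical tiny module).
model = torch.nn.Linear(4, 2)
if torch.cuda.is_available():
    model = model.cuda()          # weights sit in GPU memory after training...

model = model.cpu()               # ...so move them to CPU memory first,
torch.save(model.state_dict(), "linear_cpu.pt")  # then serialize

# Every tensor in the checkpoint is now a CPU tensor:
state = torch.load("linear_cpu.pt")
assert all(t.device.type == "cpu" for t in state.values())
```

Reloading the checkpoint later then works on any machine, GPU or not, because no tensor in the file carries a CUDA device.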