How to train a >100GB model with the Hugging Face Trainer

Hi, I want to train a model that is >100GB in size, and it OOMs when I load it with from_pretrained. What do you suggest for loading and saving 100GB models? For the training itself, I can use FSDP to distribute the weights across devices, but I am stuck on model loading and saving.

I found this article, but it only supports inference.


@maxBing12345 did you find a solution?

I'm not sure if this can be done because I've never tried it, but can you push the model to the Hub? Then you can just load and save it from there, as in the sketch below.
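
Something like this might work (a rough sketch; the model name and repo id are placeholders):

```python
# Rough sketch, untested at 100GB scale: push the checkpoint to the Hub once,
# then load/save through the Hub repo instead of a local path.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder

# Push the (sharded) checkpoint to your Hub repo
model.push_to_hub("your-username/your-100gb-model")  # placeholder repo id

# Later, load it back from the Hub
model = AutoModelForCausalLM.from_pretrained("your-username/your-100gb-model")
```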


For loading big models, you can take a look at this repo: GitHub - huggingface/peft: 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
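
For example, a minimal PEFT/LoRA sketch (the base model name and hyperparameters are placeholders, and target_modules depends on the architecture):

```python
# Sketch: with PEFT/LoRA only a small set of adapter weights is trained and
# saved, which avoids writing out the full 100GB checkpoint at every save.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder

lora_config = LoraConfig(
    r=16,                                  # adapter rank (placeholder)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # depends on the model architecture
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Saving now only writes the small adapter files
model.save_pretrained("./my-adapter")
```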

For saving, you can use save_pretrained with push_to_hub=True to push to the HF Hub. You can also set max_shard_size to split a big model into smaller checkpoint files.
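
For instance (a minimal sketch; the repo name and shard size are placeholders):

```python
# Sketch: shard the checkpoint into smaller files and push it to the Hub.
model.save_pretrained(
    "my-100gb-model",      # placeholder; also used as the Hub repo name
    push_to_hub=True,
    max_shard_size="5GB",  # each shard file stays under ~5GB
)
```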
