I downloaded a GPT2-3.5B model from Hugging Face; here are the files.
I load the model with
GPT2LMHeadModel.from_pretrained(path)
and the CPU memory usage during inference is 14 GB. But after I save the model with model.save_pretrained()
to a new folder and load it from there for inference, the memory usage is only 1 GB. I tested both .safetensors and .bin formats and got the same result.
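For reference, here is a minimal sketch of the save/reload flow I'm describing. It uses a tiny GPT2Config so it runs anywhere (the real checkpoint path and sizes are placeholders), measures resident memory around the load, and prints the dtype of the loaded weights, since a dtype difference between the original and re-saved checkpoints is one possible cause of a memory gap like this:

```python
import resource
import tempfile

from transformers import GPT2Config, GPT2LMHeadModel


def max_rss_mb():
    """Peak resident set size of this process, in MB.

    Note: ru_maxrss is reported in KB on Linux, in bytes on macOS.
    This assumes Linux.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


# Tiny config standing in for the real GPT2-3.5B checkpoint
# (hypothetical sizes, chosen only so the script runs quickly).
config = GPT2Config(n_embd=64, n_layer=2, n_head=2, vocab_size=1000)
model = GPT2LMHeadModel(config)

with tempfile.TemporaryDirectory() as tmp:
    # Re-save with save_pretrained(), as in the question
    # (writes model.safetensors by default in recent transformers).
    model.save_pretrained(tmp)

    before = max_rss_mb()
    reloaded = GPT2LMHeadModel.from_pretrained(tmp)
    after = max_rss_mb()

print(f"peak RSS before reload: {before:.1f} MB, after: {after:.1f} MB")
# Comparing the stored dtype of the two checkpoints (e.g. fp32 vs fp16)
# is one thing worth checking when the memory footprints differ.
print("loaded weight dtype:", next(reloaded.parameters()).dtype)
```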