Hey all - as I browse models for ones that suit my project, I'm trying to quickly determine the memory requirements for running each model locally. This seems like something many users would want to do, so there should be an obvious place on a model card to find it, but I don't see one. I don't even see a place where the parameter count is consistently stated. How should I approach this?
Hi! You can find this info by checking the size of pytorch_model.bin (or tf_model.h5 / flax_model.msgpack for TF/Flax models). These files can sometimes be sharded (if a pytorch_model.bin.index.json is present), in which case you need to sum up all the shards listed in the index file.
PS: for the parameter count to be displayed on the model card, the weights must be saved in the safetensors format.
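If you'd rather script this than eyeball the Files tab, here's a minimal sketch using the huggingface_hub client; the repo id is just a placeholder, and the filename pattern assumes a PyTorch checkpoint (adjust it for TF, Flax, or safetensors weights):

```python
from huggingface_hub import HfApi

# Ask the Hub for per-file metadata (including sizes) without downloading anything.
# "bert-base-uncased" is only an example repo id.
api = HfApi()
info = api.model_info("bert-base-uncased", files_metadata=True)

# Sum only the PyTorch weight files; for a sharded checkpoint this adds up the
# same shards that pytorch_model.bin.index.json lists.
total_bytes = sum(
    f.size or 0
    for f in info.siblings
    if f.rfilename.startswith("pytorch_model") and f.rfilename.endswith(".bin")
)

print(f"Checkpoint size on disk: {total_bytes / 1024**3:.2f} GiB")
```

The on-disk size is roughly what you need just to load the weights in the checkpoint's own precision; inference and training need more on top of that.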
Thanks very much for the help.
How would I determine these requirements when I'm finetuning a model? I'm finetuning a 780 MB model with a 5 MB dataset. On my local machine this runs fine (16 GB of RAM), but when I use a GPU (Tesla T4 with 16 GB of GPU RAM), I immediately get an OOM error when I launch trainer.train().
Have you tried Model Memory Calculator? Model Memory Utility - a Hugging Face Space by hf-accelerate
Nice link, thanks.
So finetuning should consume less memory than training from scratch, and the calculator says training the model I'm using needs just 4.57 GB. So I'm probably misunderstanding something about the model architecture, since I have 16 GB of GPU RAM.
Total memory required during training/tuning should depend on the batch size, right? Why isn't batch size an input to this utility?
The batch size is fixed at one, as the tool itself notes: "When training on a batch size of 1".
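Roughly speaking, the batch-size-independent part of full fine-tuning memory is weights + gradients + the two Adam optimizer states, i.e. about 4x the parameter memory; activations come on top of that and are what actually scale with batch size and sequence length. A back-of-envelope sketch (my own rule of thumb, not the exact formula the Space uses):

```python
def rough_training_memory_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Static memory for full fine-tuning with Adam, ignoring activations.

    weights (1x) + gradients (1x) + Adam first/second moments (2x) = 4x,
    all assumed to be in the same precision for simplicity.
    """
    static_bytes = 4 * num_params * bytes_per_param
    return static_bytes / 1024**3

# Example: a ~200M-parameter model in fp32 (illustrative numbers only).
print(f"{rough_training_memory_gib(200_000_000):.2f} GiB before activations")

# Activations are added on top of this and grow with batch size and sequence
# length, which is why the calculator pins the batch size to 1.
```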