Memory Requirements for Running LLMs

Is there a rule-of-thumb calculation for estimating the memory required to run an LLM, as a function of its parameter count? I'm referring to a base model (no quantization), comparing full fine-tuning against pure inference on the pre-trained weights. Let's use Llama 2 7B as an example. Below is what I've seen for pre-training, but I'm not sure how that translates to loading the pre-trained base model and working with it. The disconnect for me is that I'm unsure exactly which components of the model get stored in memory. Any help appreciated!
[Image: memory-requirements breakdown for pre-training]

https://huggingface.co/spaces/hf-accelerate/model-memory-usage is a handy tool for estimating this.
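If you'd rather stay in the terminal, recent versions of Accelerate also ship an `estimate-memory` command that backs a similar calculation; exact flag names may vary by version, so treat this as a sketch:

```
# assumes a recent `pip install accelerate`; flags may differ by version
accelerate estimate-memory meta-llama/Llama-2-7b-hf --library_name transformers
```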

There’s also this nice blog post: Calculating GPU memory for serving LLMs | Substratus.AI
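For a quick back-of-the-envelope number without any tooling, here's a rough Python sketch of the usual rule of thumb: inference needs approximately (parameter count × bytes per parameter) plus some headroom for activations and the KV cache, while full fine-tuning with Adam in mixed precision also keeps gradients and optimizer states per parameter. The ~20% inference overhead factor and the 16 bytes/param training figure below are common approximations (the latter from the mixed-precision Adam breakdown: 2 B fp16 weights + 2 B fp16 grads + 4 B fp32 master weights + 4 B momentum + 4 B variance), not exact numbers:

```python
def inference_memory_gb(n_params: float, bytes_per_param: int = 2,
                        overhead: float = 1.2) -> float:
    """Rough memory to serve a model: weights in fp16 (2 bytes each)
    plus ~20% headroom for activations and the KV cache."""
    return n_params * bytes_per_param * overhead / 1e9

def full_finetune_memory_gb(n_params: float, bytes_per_param: int = 16) -> float:
    """Rough memory for full fine-tuning with Adam in mixed precision:
    ~16 bytes/param for weights + gradients + optimizer states,
    before counting activations."""
    return n_params * bytes_per_param / 1e9

n = 7e9  # Llama 2 7B
print(f"inference (fp16): ~{inference_memory_gb(n):.0f} GB")     # ~17 GB
print(f"full fine-tune:   ~{full_finetune_memory_gb(n):.0f} GB")  # ~112 GB
```

Plugging in 7B parameters, the fp16 weights alone are ~14 GB, which is why a 7B model roughly fits on a single 24 GB card for inference but needs far more (or sharding/offloading) for full fine-tuning.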