I am looking for a calculator tool that can estimate, in advance, how much memory a GPU instance will use. Specifically, I'm interested in the following configuration:

- FSDP ZeRO-2
- Model: e.g., Llama 3 8B
- LoRA
- Max tokens: 8K
- 16-bit precision
- Gradient checkpointing
- No quantization
Any guidance or recommendations would be greatly appreciated! Thank you.
Edit: I know about the HF tool ("Understanding how big of a model can fit on your machine"), but it is rudimentary and lacks many parameters such as max sequence length, LoRA, …
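Until a proper calculator covers these knobs, a rough back-of-envelope estimate for this setup can be sketched by hand. The sketch below assumes frozen 16-bit base weights (replicated per GPU, since ZeRO-2 shards only gradients and optimizer state), Adam optimizer state in fp32 for the trainable LoRA parameters only, and roughly one saved activation per layer boundary under gradient checkpointing. The `lora_params` count is a hypothetical placeholder (it depends on rank and target modules), and the activation term ignores attention buffers and temporary workspace, so treat the total as a lower bound:

```python
def estimate_vram_gb(
    n_params=8e9,            # assumed dense parameter count for Llama 3 8B
    hidden=4096, n_layers=32,  # Llama 3 8B architecture
    seq_len=8192, batch=1,     # 8K max tokens
    lora_params=42e6,        # hypothetical adapter size; depends on rank/targets
    num_gpus=1,
):
    """Back-of-envelope per-GPU VRAM: 16-bit frozen base weights + LoRA
    adapters + ZeRO-2-sharded gradients/optimizer state + checkpointed
    activations. Ignores attention/workspace buffers, so it underestimates."""
    GB = 1024 ** 3
    weights = n_params * 2 / GB               # frozen bf16 base, replicated
    adapters = lora_params * 2 / GB           # trainable LoRA weights, bf16
    grads = lora_params * 2 / num_gpus / GB   # bf16 grads, sharded by ZeRO-2
    optim = lora_params * 8 / num_gpus / GB   # Adam m+v in fp32, sharded
    # gradient checkpointing: keep ~one activation tensor per layer boundary
    acts = batch * seq_len * hidden * n_layers * 2 / GB
    return {
        "weights_gb": weights, "adapters_gb": adapters,
        "grads_gb": grads, "optim_gb": optim, "acts_gb": acts,
        "total_gb": weights + adapters + grads + optim + acts,
    }

print(estimate_vram_gb())
```

With these assumptions the base weights alone come to about 14.9 GiB per GPU, which is why quantization-free LoRA on an 8B model still needs a fairly large card even though the trainable state is tiny.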