Estimate GPU memory for training a 150B LLM

Hi everyone,

I’m trying to estimate the GPU memory needed to train a 150B-parameter language model.
This seems complex because many factors are involved (precision, optimizer, activations, parallelism).
What factors should I account for? Is there a formula for this?
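
A rough back-of-the-envelope sketch (my own assumption, not an exact formula): with mixed-precision Adam, each parameter commonly accounts for about 16 bytes — fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4) + Adam first moment (4) + Adam second moment (4) — before activation memory, which depends on batch size, sequence length, and checkpointing:

```python
def training_memory_gb(n_params: float, bytes_per_param: float = 16) -> float:
    """Rough per-replica memory for weights, gradients, and Adam optimizer
    states in mixed precision. Ignores activations and framework overhead."""
    return n_params * bytes_per_param / 1e9

# 150B parameters at ~16 bytes/param (assumed mixed-precision Adam breakdown)
print(training_memory_gb(150e9))  # → 2400.0 (GB, before activations)
```

This is why model/tensor/pipeline parallelism and optimizer sharding (e.g. ZeRO) are needed: no single GPU holds multiple terabytes, so the state is partitioned across devices.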