How to predict the memory requirements for a given model?

Hi! I was wondering whether there is an exact method, or at least a rule of thumb, to determine the GPU memory required to train a model, given the input and output sequence lengths (I'm specifically interested in seq2seq models), the model configuration, and the model type. Also, are there any good practices for reducing this requirement?
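For context, here's the back-of-envelope estimate I've seen used so far (assuming plain fp32 training with the Adam optimizer; activation memory is excluded, and the 400M parameter count below is just an illustrative figure):

```python
def estimate_model_state_bytes(num_params: int) -> int:
    """Rough rule of thumb for fp32 training with Adam:
    weights (4 B) + gradients (4 B) + Adam moments (8 B) = 16 B/param.
    Activations are NOT included -- they scale with batch size and
    sequence length, and often dominate for long sequences."""
    weights = 4 * num_params      # fp32 parameters
    gradients = 4 * num_params    # fp32 gradients
    optimizer = 8 * num_params    # Adam keeps two fp32 moment buffers
    return weights + gradients + optimizer

# e.g. a ~400M-parameter seq2seq model:
print(estimate_model_state_bytes(400_000_000) / 1e9)  # 6.4 (GB), before activations
```

But this ignores activations entirely, which is exactly where the sequence lengths come in, hence my question.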
