Why is activation memory computed through an experiment rather than a formula in the DeepSpeed autotuner?

I am trying to understand the reasoning behind the design choice of computing activation memory by running an experiment rather than deriving it from a formula in the DeepSpeed autotuner.