I want to profile every layer of a model: time, memory, and performance counters (IPC, for instance).
From a PyTorch perspective, there is the PyTorch profiler (PyTorch Profiler — PyTorch Tutorials 2.2.0+cu121 documentation) and there are forward/backward hooks on layers (on their own, these won't let me measure a layer, only track where it starts).
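To illustrate the hook idea: pairing a forward pre-hook with a forward hook does at least give per-layer wall-clock time (though not memory or IPC). A minimal sketch, with a toy model of my own standing in for a real one:

```python
import time
import torch
import torch.nn as nn

def attach_timers(model):
    """Record wall-clock time per submodule via paired pre/post forward hooks."""
    timings = {}

    def pre_hook(name):
        def fn(module, inputs):
            # Store the negated start time; the post hook adds the end time.
            timings.setdefault(name, []).append(-time.perf_counter())
        return fn

    def post_hook(name):
        def fn(module, inputs, output):
            timings[name][-1] += time.perf_counter()
        return fn

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            module.register_forward_pre_hook(pre_hook(name))
            module.register_forward_hook(post_hook(name))
    return timings

# Toy model for demonstration only
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
timings = attach_timers(model)
model(torch.randn(8, 16))
for name, spans in timings.items():
    print(f"{name}: {sum(spans) * 1e6:.1f} us")
```

This only yields durations, which is why I'm looking at the profiler for the richer metrics.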
The PyTorch profiler seems like a good approach, but for Hugging Face models it fails to provide useful per-layer information.
For instance, when using the PyTorch profiler with the model GPT-J (from GPT-J), I get the following output, which shows no layers, only auxiliary operators:
----------------------  ------------  ------------  ------------  ------------  ------------  ------------
                  Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
----------------------  ------------  ------------  ------------  ------------  ------------  ------------
               forward        90.29%     558.000us        94.34%     583.000us     583.000us             1
           aten::zeros         5.02%      31.000us         5.66%      35.000us      35.000us             1
          aten::unbind         1.62%      10.000us         2.43%      15.000us      15.000us             1
          aten::detach         0.49%       3.000us         1.29%       8.000us       8.000us             1
          aten::select         0.65%       4.000us         0.81%       5.000us       5.000us             1
                detach         0.81%       5.000us         0.81%       5.000us       5.000us             1
           aten::empty         0.65%       4.000us         0.65%       4.000us       2.000us             2
           aten::zero_         0.16%       1.000us         0.16%       1.000us       1.000us             1
      aten::as_strided         0.16%       1.000us         0.16%       1.000us       1.000us             1
              aten::to         0.16%       1.000us         0.16%       1.000us       1.000us             1
    aten::resolve_conj         0.00%       0.000us         0.00%       0.000us       0.000us             1
     aten::resolve_neg         0.00%       0.000us         0.00%       0.000us       0.000us             1
----------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 618.000us
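For completeness, this is roughly how I invoked the profiler. Here I use a toy module in place of GPT-J (loading the real model is irrelevant to the symptom; any nn.Module produces the same op-level-only table):

```python
import torch
from torch.profiler import profile, record_function, ProfilerActivity

# Toy stand-in for GPT-J; the symptom is the same for any nn.Module.
model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU())
x = torch.randn(4, 16)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    with record_function("forward"):  # label the whole forward pass
        model(x)

# Prints aten::* operators, but nothing grouped by layer/module.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=12))
```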
What would be the right approach to profile all layers? I should add that I'm running on CPU only.
What would the approach be when running on a GPU? Is there a cross-platform mechanism?
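For the GPU case, I assume the same API applies by adding CUDA to the profiler's activities, along the lines of the sketch below (again with a stand-in module), but I'd like to confirm this is the intended cross-platform mechanism:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Collect CUDA activity only when a GPU is present; same code path otherwise.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(16, 4).to(device)  # stand-in module
x = torch.randn(8, 16, device=device)

with profile(activities=activities) as prof:
    model(x)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```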