When we load a model in 4-bit, the linear layers are replaced with 4-bit linear layers. These layers report half the number of parameters, but I am also not clear on how the number of parameters becomes half.
In the source code, when 4-bit loading is selected, the parameter count is divided by 2.
I don’t know why; you can check the code.
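One likely explanation, offered as a sketch and not as a confirmed reading of the library code: there is no native 4-bit tensor dtype, so 4-bit weights are typically stored packed, two values per byte, in a `uint8` tensor. Counting the elements of that storage tensor then gives half the logical number of weights. The toy example below shows the packing in plain Python; `pack_4bit` and `unpack_4bit` are hypothetical helper names for illustration, not functions from any library.

```python
def pack_4bit(codes):
    """Pack 4-bit integers (0..15) into bytes, two codes per byte."""
    if len(codes) % 2 != 0:
        raise ValueError("expected an even number of 4-bit codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("each code must fit in 4 bits (0..15)")
    # First code of each pair goes in the high nibble, second in the low nibble.
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


def unpack_4bit(packed):
    """Recover the original 4-bit codes from the packed bytes."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)      # high nibble
        codes.append(byte & 0x0F)    # low nibble
    return codes


# Eight logical 4-bit "weights" (quantization codes, not real model values).
codes = [3, 15, 0, 7, 9, 1, 12, 4]
packed = pack_4bit(codes)

logical_count = len(codes)   # number of weights the layer really has
stored_count = len(packed)   # number of elements in the packed storage
print(logical_count, stored_count)
```

If a parameter counter simply sums the element counts of the stored tensors, a packed 4-bit layer would show up at half its logical size, which would match what is described above. To get the logical count back, the stored count would need to be multiplied by 2.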
I still don’t have a clear answer for this and would love to know. Bumping for visibility.
I have the same doubt too. @sgugger @sayakpaul sorry to tag you guys… I think I could use your help here.