+1, I'm looking into the same question: why does the number of parameters of the model (all parameters) also decrease with quantization alone? This method should not prune the model in any way, should it?
One explanation I have is from @dfrank's ChatGPT answer, which states that with quantization some parameters may become 0 and the matrices may overall become sparser, so that pruning can then be applied. That sounds logical, but how is this procedure justified in the construction of the model (does torch or bitsandbytes do this automatically)?
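For reference, here is roughly how I'm comparing the counts (a minimal sketch; the model id is just a placeholder for whatever checkpoint you're loading, and `device_map="auto"` is only there because 4-bit loading needs a device):

```python
# Sketch: compare the reported parameter count with and without 4-bit quantization.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "facebook/opt-125m"  # placeholder model, use your own checkpoint

# Full-precision load
model_fp = AutoModelForCausalLM.from_pretrained(model_id)
print("fp params:", sum(p.numel() for p in model_fp.parameters()))

# 4-bit quantized load via bitsandbytes
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
print("4-bit params:", sum(p.numel() for p in model_4bit.parameters()))
```

With this kind of count the second number comes out lower for me, which is what prompted the question.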