Fewer Trainable Parameters after Quantization

After some investigation, I think it might be because Linear4bit (https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/nn/modules.py#L207) sets requires_grad=False on its weight, which removes a large number of parameters from the trainable count.
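
To illustrate what I mean, here is a minimal sketch (assuming bitsandbytes is installed; the layer sizes are arbitrary and the exact counts depend on version and bias settings) that compares the trainable-parameter count of a plain nn.Linear with a bnb.nn.Linear4bit of the same shape:

```python
import torch.nn as nn
import bitsandbytes as bnb

def count_trainable(module: nn.Module) -> int:
    """Count parameters that would receive gradients during training."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# A regular linear layer: both weight and bias are trainable.
fp_linear = nn.Linear(1024, 1024)
print("nn.Linear trainable params:", count_trainable(fp_linear))

# The 4-bit counterpart: if the weight is created with requires_grad=False,
# only the bias (if present) shows up as trainable.
q_linear = bnb.nn.Linear4bit(1024, 1024)
print("Linear4bit trainable params:", count_trainable(q_linear))

# Inspect each parameter to see where the difference comes from.
for name, p in q_linear.named_parameters():
    print(name, p.requires_grad, p.numel())
```

If this is the explanation, the reported trainable-parameter number after quantization would drop by roughly the size of all quantized weight matrices, leaving only biases and any non-quantized modules.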