Is this needed: bnb_4bit_use_double_quant=True?

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

Regarding the code above: I've seen two tutorials about QLoRA. One didn't have the 'bnb_4bit_use_double_quant=True' line, and the other did.

For a bit more clarity: the one that included the line was loading a Falcon-7B model with AutoModelForCausalLM, and the one that didn't include it was also using a Falcon model, but a sharded fp16 version. Does that make a difference?

Thanks! It's not a big deal, I'm just trying to understand it better.

It's optional: bnb_4bit_use_double_quant defaults to False, so the tutorial that omitted the line simply wasn't using double quantization. When enabled, it further reduces the average memory footprint by quantizing the quantization constants themselves.
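
To see where that saving comes from, here's the back-of-the-envelope arithmetic from the QLoRA paper. The block sizes below are the paper's defaults; treat the exact numbers as an assumption about the bitsandbytes internals rather than something guaranteed by the library:

# First-level quantization: one fp32 constant per block of 64 weights.
first_blocksize = 64
fp32_bits = 32
overhead_single = fp32_bits / first_blocksize            # 0.5 bits per parameter

# Double quantization: the constants themselves are quantized to 8 bits,
# with a second fp32 constant per block of 256 first-level constants.
second_blocksize = 256
int8_bits = 8
overhead_double = (int8_bits / first_blocksize
                   + fp32_bits / (first_blocksize * second_blocksize))

print(overhead_single - overhead_double)  # ~0.373, i.e. the "0.4 bits per parameter"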

See Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA: https://huggingface.co/blog/4bit-transformers-bitsandbytes

“options include bnb_4bit_use_double_quant which uses a second quantization after the first one to save an additional 0.4 bits per parameter.”

“A rule of thumb is: use double quant if you have problems with memory, use NF4 for higher precision, and use a 16-bit dtype for faster finetuning.”
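
If you want to measure the difference on your own setup, here's a minimal sketch comparing the two configs on the Falcon-7B checkpoint from your question. It assumes a GPU with enough memory to load the model; get_memory_footprint() is the standard transformers helper for the in-memory model size:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_falcon_4bit(double_quant):
    # Same config as in the question, with double quantization toggled.
    cfg = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=double_quant,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    # A sharded fp16 checkpoint loads the same way: sharding only changes
    # how the weights are stored on disk, not how they get quantized.
    return AutoModelForCausalLM.from_pretrained(
        "tiiuae/falcon-7b",
        quantization_config=cfg,
        device_map="auto",
    )

for dq in (False, True):
    model = load_falcon_4bit(dq)
    print(f"double_quant={dq}: {model.get_memory_footprint() / 1e9:.2f} GB")
    del model
    torch.cuda.empty_cache()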