Regarding the code above: I've seen two tutorials about QLoRA. One didn't include the `bnb_4bit_use_double_quant=True` line, and the other did.
For a bit more clarity: the one that included it was using a Falcon 7B model with `AutoModelForCausalLM`, and the one that didn't was also using a Falcon model, but a sharded fp16 version. Does that make a difference?
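For reference, here's roughly what the two setups looked like, sketched from memory rather than copied from either tutorial. The model ID is an assumption (I'm just using the base `tiiuae/falcon-7b` repo as a stand-in for whichever checkpoint each tutorial actually loaded). As I understand it, `bnb_4bit_use_double_quant` enables the QLoRA paper's nested quantization (quantizing the quantization constants themselves, saving roughly 0.4 bits per parameter), and it defaults to `False` when omitted:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Tutorial 1: double quantization enabled. The quantization constants
# are themselves quantized, trimming memory use a little further.
bnb_config_with_dq = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Tutorial 2: same setup minus that line, so it falls back to the
# default, bnb_4bit_use_double_quant=False.
bnb_config_without_dq = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",       # stand-in; the sharded fp16 repo ID would go here instead
    quantization_config=bnb_config_with_dq,
    device_map="auto",
    trust_remote_code=True,   # Falcon shipped custom modeling code at the time
)
```

If I've got it right, the sharded-vs-unsharded distinction only changes how the checkpoint files are split up for loading, not whether double quantization can be applied, so the two flags should be independent of which Falcon variant you pick. Happy to be corrected on that.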
Thanks! Not a big deal, just trying to understand it better.