I’m using bitsandbytes int8 and PEFT with transformers. Is there any advantage to using bf16 with int8 even though it gets cast to fp16 during quantization? Or should this be strictly avoided?
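To make the cast I'm asking about concrete, here is a rough pure-Python sketch of what can happen to a bf16 value when it is converted to fp16. This is only an illustration of the two number formats, not how bitsandbytes does the conversion internally. `to_bf16` and `to_fp16` are hypothetical helpers I wrote for this post, and the bf16 emulation truncates instead of rounding.

```python
import struct


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of a float32 (truncation)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE half precision; out-of-range values become inf."""
    try:
        return struct.unpack(">e", struct.pack(">e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range (max 65504)
        return float("inf") if x > 0 else float("-inf")


if __name__ == "__main__":
    for value in (1.5, 70000.0, 1e-8):
        as_bf16 = to_bf16(value)
        print(value, "-> bf16:", as_bf16, "-> fp16:", to_fp16(as_bf16))
```

Ordinary values survive the cast, but large magnitudes that bf16 can represent overflow in fp16, and very small ones flush to zero, which is part of why I'm unsure whether bf16 is worth using here.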