Bitsandbytes quantization and QLoRA fine-tuning

Do you have an answer or clarification for this? I have a similar confusion: I am fine-tuning Llama 3.1 with QLoRA and am unable to load the model so that its weight tensors have `tensor.dtype == torch.bfloat16`.