Resolving "Cannot Perform Fine-Tuning on Purely Quantized Models" Error in Falcon Model Training?

I am training a Falcon model for a QnA task (I want to use it for medical QnA), but during training I got the error below. I also tried installing the bitsandbytes version mentioned in the error message and I am still facing the same issue. Can anyone suggest a better approach?

How can I resolve this error?


import transformers

training_args = transformers.TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=6,
    learning_rate=2e-4,
    fp16=True,
    save_total_limit=3,
    logging_steps=500,
    output_dir="experiments",
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    push_to_hub=True,
)
trainer = transformers.Trainer(
    model=model,
    train_dataset=train_data_transformed,
    args=training_args,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False

The error is:

ValueError                                Traceback (most recent call last)
Cell In[76], line 1
----> 1 trainer = transformers.Trainer(
      2     model=model,
      3     train_dataset=train_data_transformed,
      4     args=training_args,
      5     data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
      6 )
      7 model.config.use_cache = False

File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:408, in Trainer.__init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics)
    406 # At this stage the model is already loaded
    407 if _is_quantized_and_base_model and not _is_peft_model:
--> 408     raise ValueError(
    409         "You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of"
    410         " the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft"
    411         " for more details"
    412     )
    413 elif _is_quantized_and_base_model and not getattr(model, "_is_quantized_training_enabled", False):
    414     raise ValueError(
    415         "The model you want to train is loaded in 8-bit precision.  if you want to fine-tune an 8-bit"
    416         " model, please make sure that you have installed `bitsandbytes>=0.37.0`. "
    417     )

ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details

Like the error says, fine-tuning a purely quantized model directly is not possible; you have to attach trainable adapters on top of the quantized model and train those instead. Something like peft worked for me.
Issue with Fine-tuning LLM for Classification · Issue #27702 · huggingface/transformers (github.com)
Load adapters with 🤗 PEFT (huggingface.co)
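
For context, this check fires when the base model was loaded with a bitsandbytes quantization config (load_in_4bit or load_in_8bit) and then handed to the Trainer without any PEFT adapters attached. Your loading code isn't shown, but it presumably looks roughly like the sketch below; the checkpoint name and the quantization settings here are just placeholders, not your actual setup.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical 4-bit load -- adjust to match how you actually load Falcon
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

A model loaded like this is "purely quantized" until adapters are added, which is exactly what the Trainer refuses to train.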

For example, you can set up a LoraConfig with peft and attach it to your model with get_peft_model before creating the Trainer. (Note that transformers.Trainer itself does not accept a peft_config argument; if you want to pass the config directly to a trainer, use trl's SFTTrainer instead.)

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # Falcon's fused attention projection
)

# Attach trainable LoRA adapters on top of the quantized base model
model = get_peft_model(model, lora_config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=train_data_transformed,
    args=training_args,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
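
peft also provides prepare_model_for_kbit_training for quantized base models; running it before get_peft_model and then checking the trainable parameter count is a common pattern. A minimal sketch, assuming a recent peft release and the lora_config defined above; it replaces the single get_peft_model line in the example:

from peft import get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)  # cast norm layers to fp32, enable input grads
model = get_peft_model(model, lora_config)      # attach the LoRA adapters
model.print_trainable_parameters()              # only a small fraction (the adapters) should be trainable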

** Not an expert here, but I encountered the same problem.