Using Trainer class + 4/8 bit quantised model for prediction

Hi all,

I would like to run trainer.predict for generation with e.g. mistralai/Mistral-7B-v0.1.

The trainer is initialised like

trainer = Seq2SeqTrainer(
    model,
    args=training_args,
    ...
)

However, when I try to initialise the trainer with a model loaded in 4 or 8 bit, I get the following error:

ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: Load adapters with 🤗 PEFT for more details

Well, I don’t want to perform fine-tuning, just prediction.

Is there a smart way to work around this error?
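For context, the model is loaded in 4 bit roughly like this (a sketch of the assumed setup, not verbatim from my script; the exact quantisation arguments may differ):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# assumed 4-bit quantisation config via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantise weights to 4 bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # run compute in bfloat16
)

# loading a purely quantised model like this is what later trips the
# "cannot fine-tune purely quantized models" check in the Trainer
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```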

Thanks for your help!

Cheers,
Stephan

# create LoRA configuration object
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # type of task to train on
    inference_mode=False,          # set to False for training
    r=8,                           # rank of the low-rank update matrices
    lora_alpha=32,                 # scaling factor
    lora_dropout=0.1,              # dropout applied to the LoRA layers
)

model.add_adapter(lora_config, adapter_name="lora_1")

# this worked for me once I called model.add_adapter()

Alternatively, you can pass lora_config directly to the trainer via peft_config:

trainer = SFTTrainer(
    model,
    args=sft_config,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=lora_config,
)
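If you only need generation and not the Trainer's evaluation loop, another option is to bypass the trainer entirely and call model.generate on the quantised model (a minimal sketch, assuming `model` is the 4-bit model already loaded; the prompt and generation arguments are just placeholders):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# plain inference does not trigger the fine-tuning check,
# since no Trainer is involved
inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```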
