Hi all,

I would like to run `trainer.predict` for generation with e.g. `mistralai/Mistral-7B-v0.1`.
The trainer is initialised like this:

```python
trainer = Seq2SeqTrainer(
    model,
    args=training_args,
    ...
)
```
However, when I initialise the trainer with a model loaded in 4-bit or 8-bit, I get the following error:

```
ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: Load adapters with 🤗 PEFT for more details
```
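For context, this is roughly how I'm loading the model (a minimal sketch; the exact `BitsAndBytesConfig` settings here are just an example):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Example 4-bit quantization config (settings are illustrative)
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```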
Well, I don’t want to perform fine-tuning, just prediction.

Is there any smart way to work around this error?
Thanks for your help!
Cheers,
Stephan