I have quantized a model using BitsAndBytes, and I want to evaluate it on some benchmark tasks. For other (non-quantized) models, I use trainer.evaluate() to obtain the metrics.
This does not work for quantized models, however. I use the following code:
...
trainer = Trainer(
    model=model_quantized,
    ...
)
trainer.evaluate(test_data)
...
I obtain the following error:
ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details
I understand why the error is raised, since I am passing a quantized model to the Trainer, but I do not intend to train this model; I only want to perform evaluation.
Is there a way to avoid this error and run evaluation on a quantized model?
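For reference, I know I could write a manual evaluation loop and skip the Trainer entirely, along these lines (a minimal sketch: the dummy model and random data stand in for my model_quantized and test_data, and the metric is just an averaged loss, not my benchmark metrics), but I would prefer to keep using trainer.evaluate():

```python
import torch
import torch.nn as nn

# Stand-ins so the sketch is self-contained; in my case these would be
# model_quantized and the tokenized test_data from the question.
model = nn.Linear(8, 8)
data = [(torch.randn(4, 8), torch.randn(4, 8)) for _ in range(3)]
loss_fn = nn.MSELoss()

model.eval()  # inference mode; no adapters or gradients needed
total_loss, n_batches = 0.0, 0
with torch.no_grad():  # evaluation only, so no autograd graph is built
    for inputs, targets in data:
        outputs = model(inputs)
        total_loss += loss_fn(outputs, targets).item()
        n_batches += 1

avg_loss = total_loss / n_batches
print(f"eval_loss: {avg_loss:.4f}")
```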
Thanks in advance!