Using the Trainer class with a 4/8-bit quantised model for prediction

Hi all,

I would like to run trainer.predict for generation with, e.g., mistralai/Mistral-7B-v0.1.

The trainer is initialised like this:

trainer = Seq2SeqTrainer(
    model,
    args=training_args,
    ...
)
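
The model itself is loaded in 4 bit roughly like this (a minimal sketch; the exact BitsAndBytesConfig settings here are my assumption):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantisation config (assumed settings; load_in_8bit=True would be the 8-bit case)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)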

However, when I initialise the trainer with the model in 4 or 8 bit, I get the following error:

ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: Load adapters with 🤗 PEFT for more details

Well, I don’t want to perform fine-tuning, just prediction.
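
Concretely, the only call I need is something like this (test_dataset is a placeholder for my evaluation data, with predict_with_generate=True set in the training args):

predictions = trainer.predict(test_dataset)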

Any smart way to work around this error?
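
(The fallback I’d rather avoid is dropping the Trainer entirely and running a plain generate() loop, sketched here with a hypothetical tokenizer and prompt:)

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
inputs = tokenizer("Some prompt", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))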

Thanks for your help!

Cheers,
Stephan