from quanto import Calibration, freeze, qfloat8, qint4, qint8, quantize, calibrate
...
quantize(self.model, weights=qfloat8, activations=qfloat8)
tokenizer = AutoTokenizer.from_pretrained('mistralai/Mistral-7B-Instruct-v0.1')
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
samples = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:100]")
text_data = samples['text']
input_ids = tokenizer(text_data, return_tensors='pt', padding=True, truncation=True).input_ids
with Calibration(momentum=0.9):
    calibrate(self.model, tokenizer, input_ids, self.device, 10)
freeze(self.model)
I’m getting
TypeError: 'module' object is not callable
on the calibrate line. I’m not sure why, because as far as I can tell calibrate is a function. Note that if I only quantize the weights and skip calibration, everything works fine. Any ideas on how to resolve this?
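For reference, this error message generally means the name being called is bound to a module object rather than a function, which can happen when a package exposes a submodule under the name being imported. Whether that is what happens with quanto here is an assumption, not something confirmed above. The sketch below uses a stand-in module (`fake_calibrate`, a hypothetical name) instead of quanto, and the helpers `describe` and `call_error` are made up for illustration:

```python
import types

# Stand-in for an imported name that is actually a module.
# "fake_calibrate" is hypothetical; it is not part of quanto.
fake_calibrate = types.ModuleType("fake_calibrate")


def describe(obj):
    """Report whether a name is bound to a module, a callable, or something else."""
    if isinstance(obj, types.ModuleType):
        return "module"
    return "callable" if callable(obj) else "other"


def call_error(obj):
    """Call obj with no arguments; return the TypeError message, or None if the call succeeds."""
    try:
        obj()
    except TypeError as exc:
        return str(exc)
    return None


print(describe(fake_calibrate))    # what kind of object the name is bound to
print(call_error(fake_calibrate))  # the message raised when a module is called
```

Running the same kind of check on the real import, for example `print(type(calibrate))` just before the failing line, would show whether the name is bound to a function or a module.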
Thanks in advance!