Hey everyone! I hope you’re all having a great day.
So, I've been experimenting with the AutoGPTQ library, trying to quantize a code LLM from the BigCode family (e.g. StarCoder).
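For context, here's roughly what I'm doing, a minimal sketch following AutoGPTQ's standard quantization flow; the model ID, output directory, and calibration text are placeholders for my actual setup:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_id = "bigcode/starcoder"  # placeholder for the BigCode model I'm using
quantized_model_dir = "starcoder-4bit-gptq"  # placeholder output path

quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to 4-bit
    group_size=128,  # commonly recommended group size
    desc_act=False,  # faster inference at a small perplexity cost
)

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_id, use_fast=True)

# GPTQ needs a small calibration set; in practice these would be
# representative code samples, not a single toy snippet
examples = [
    tokenizer("def fibonacci(n):\n    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)")
]

# load the unquantized model (CPU by default), then quantize and save
model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_id, quantize_config)
model.quantize(examples)
model.save_quantized(quantized_model_dir)
```

But I've got a couple of questions: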
- When should you usually quantize a model: before or after the fine-tuning stage?
- Is there a particular reason it's better to do it before (or after) fine-tuning?
Thanks!