Quantization before or after fine-tuning?

Hey everyone! I hope you’re all having a great day.
So, I was messing around with the AutoGPTQ library, trying to quantize a code LLM (from the BigCode family), and I've got a couple of questions:

  1. When should you usually quantize a model: before or after the fine-tuning stage?
  2. Is there a particular reason it's better to do it one way or the other?

Thanks!