Is there a way to do 4-bit quantization using Optimum?
Hi @sameearif88, GPTQ will soon be available in Optimum to enable 4-bit quantization. You can follow the ongoing PR here: https://github.com/huggingface/optimum/pull/1216
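For anyone wondering what "4-bit" means numerically while the PR is in progress, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not GPTQ and not the Optimum API: it uses simple round-to-nearest with one scale and offset per list of weights, and the function names `quantize_4bit` and `dequantize_4bit` are hypothetical. GPTQ itself picks the rounding more carefully using calibration data, but the storage idea (16 integer levels per weight) is the same.

```python
# Illustrative only: round-to-nearest 4-bit quantization, NOT the GPTQ algorithm
# and NOT an Optimum API. Function names are hypothetical.

def quantize_4bit(weights):
    """Map floats to integer codes in [0, 15] using one scale and offset."""
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels, i.e. 15 steps between lo and hi.
    # Fall back to 1.0 when all weights are equal to avoid dividing by zero.
    scale = (hi - lo) / 15 or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_4bit(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale + lo for c in codes]


weights = [-0.8, -0.1, 0.0, 0.35, 0.7]
codes, scale, lo = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale, lo)

print(codes)     # integer codes, each fits in 4 bits
print(restored)  # approximations of the original weights
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is the precision you trade for the smaller memory footprint.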
Hello,
Is it possible to apply 4-bit quantization to the OpenAI Whisper and Facebook MMS audio models using Optimum?
Thanks
@sameearif88 GPTQ only works for text models at the moment, so it won’t be possible to perform 4-bit quantization of speech models right away.
@sameearif88 Feel free to open a feature request for it on GitHub.