Possible to use transformers with GGML-style quantization?

neilf79 · February 10, 2024, 2:50pm

Hi!

Is it possible to use the transformers library with GGML-style quantization, for example GGML_TYPE_Q5_K (as detailed here)?

I’m aware that I can use bitsandbytes to do 8-bit and 4-bit quantization, but I’d prefer to use GGML-style quantization.
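To make clear what I mean by "GGML-style", here is a rough sketch of the general idea as I understand it: weights are split into small blocks, and each block stores low-bit integer codes plus its own scale and minimum. This is only an illustration, not the actual GGML_TYPE_Q5_K layout. The block size of 32, the unpacked integer codes, and the function names are all assumptions made for the example. As far as I can tell, the real format also packs the bits and stores the per-block scales and minimums in quantized form inside larger super-blocks, which is not reproduced here.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # weights per block (assumption for this sketch)
LEVELS = 31      # 5 bits give integer codes in the range 0..31

Block = Tuple[float, float, List[int]]  # (scale, minimum, codes)


def quantize_block(block: List[float]) -> Block:
    """Map one block of floats to 5-bit codes plus a float scale and minimum."""
    lo, hi = min(block), max(block)
    # Fall back to a scale of 1.0 for a constant block to avoid dividing by zero.
    scale = (hi - lo) / LEVELS if hi > lo else 1.0
    codes = [min(LEVELS, max(0, round((w - lo) / scale))) for w in block]
    return scale, lo, codes


def dequantize_block(scale: float, lo: float, codes: List[int]) -> List[float]:
    """Reconstruct approximate weights as scale * code + minimum."""
    return [scale * q + lo for q in codes]


def quantize(weights: List[float]) -> List[Block]:
    """Quantize a flat list of weights block by block."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Block]) -> List[float]:
    """Invert quantize(), up to rounding error."""
    restored: List[float] = []
    for scale, lo, codes in blocks:
        restored.extend(dequantize_block(scale, lo, codes))
    return restored
```

Because every block carries its own scale and minimum, the rounding error for a weight is bounded by half of that block's scale rather than by a single range for the whole tensor.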

Thanks!
