A quantized model may run in a CPU environment, but the non-quantized model will probably be faster there, so it is the one selected here.
# MODEL_ID = "Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4"
MODEL_ID = "Qwen/Qwen1.5-0.5B-Chat"
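
Instead of commenting one line in or out by hand, the choice could be made at runtime. The following is only a minimal sketch: the two model IDs come from the lines above, while `pick_model_id` and `detect_gpu` are hypothetical helper names introduced for illustration, and the check assumes PyTorch is the backend in use.

```python
# Model IDs taken from the snippet above.
QUANTIZED_ID = "Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4"
FULL_PRECISION_ID = "Qwen/Qwen1.5-0.5B-Chat"


def pick_model_id(has_gpu: bool) -> str:
    """Hypothetical helper: GPTQ checkpoint on GPU, non-quantized one on CPU."""
    return QUANTIZED_ID if has_gpu else FULL_PRECISION_ID


def detect_gpu() -> bool:
    """Hypothetical helper: True if PyTorch reports a CUDA device.

    Falls back to False when PyTorch is not installed (assumption: no
    PyTorch means a CPU-only setup).
    """
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


MODEL_ID = pick_model_id(detect_gpu())
print(MODEL_ID)
```

On a machine without a CUDA device this selects the same non-quantized model as the manual assignment above.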