GPTQ and AWQ quantized models don't work

I am using a Space with the Docker template + ChatUI to test models. Every quantized model builds successfully but never responds. When I type a question in the chat box, it keeps loading and never returns an answer. I checked the log, and the last line looks like this:

INFO compat_generate{default_return_full_text=true compute_type=Extension(ComputeType("1-nvidia-a10g"))}:generate_stream{parameters=GenerateParameters { best_of: None, temperature: Some(0.2), repetition_penalty: Some(1.2), frequency_penalty: None, top_k: Some(50), top_p: Some(0.95), typical_p: None, do_sample: false, max_new_tokens: Some(1024), return_full_text: Some(false), stop: , truncate: Some(1000), watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None } total_time="33.175176192s" validation_time="1.268962ms" queue_time="70.212µs" inference_time="33.173837168s" time_per_token="6.634767433s" seed="Some(3004468518659526318)"}: text_generation_router::server: router/src/ Success
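
For what it's worth, the timing fields in that line can be turned into a rough token count. Below is a small Python sketch that does this. It assumes `time_per_token` is simply `inference_time` divided by the number of generated tokens, which is my reading of the field names and not something I have confirmed. The helper name `seconds` is made up for this example.

```python
import re

# Timing fields taken from the log line (values are in seconds).
log_fields = 'inference_time="33.173837168s" time_per_token="6.634767433s"'


def seconds(field: str, text: str) -> float:
    """Extract a `name="<number>s"` field from the log text as seconds."""
    match = re.search(rf'{field}="([0-9.]+)s"', text)
    if match is None:
        raise ValueError(f"field {field!r} not found")
    return float(match.group(1))


inference_time = seconds("inference_time", log_fields)
time_per_token = seconds("time_per_token", log_fields)

# Assumption: time_per_token = inference_time / number of generated tokens.
generated_tokens = round(inference_time / time_per_token)
print(generated_tokens)
```

If that assumption holds, the request produced about 5 tokens in roughly 33 seconds and then ended with Success, so generation seems to run but very slowly, and nothing shows up in the chat.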

Does anyone know why?