Hey guys, I just quantized the Llama 3 ChatQA fine-tuned model to 4-bit AWQ.
I thought it would be useful since there isn't a good chat fine-tuned version of Llama 3 available yet. Do check it out.