Hello, dear community of the HuggingFace Forums.
Recently, most of DavidAU's models were deleted without any prior warning, so I lost access to the discussions and tests I'd been doing with these now-lost models.
Right now I need these models:
Llama-3.1-Dark-Planet-SuperNova-8B-GGUF
Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF
GEMMA-3-1B-IT-MAX-HORROR-GGUF
TinyLlama-1.1B-Chat-v1.0-UltraQ-NEO1-GGUF
Only the Q6 quants of these models are needed.
Any help is greatly appreciated.
Hmm… Try searching here by parts of the model names… There might be quantized versions left behind by others.
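For example, a minimal sketch with the official huggingface_hub client; the search fragments are just taken from the model names above, and the "gguf" tag filter is an assumption about how reuploads are tagged:

```python
from huggingface_hub import HfApi

api = HfApi()
# Search the Hub by fragments of the deleted models' names.
for fragment in ["Dark-Planet-SuperNova-8B", "DeepHermes-Deep-Reasoning-8B"]:
    for model in api.list_models(search=fragment, filter="gguf", limit=20):
        print(model.id)
```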
Thanks for the suggestion, but the quants from others are different, and mostly for the worse.
I tried mradermacher's Q5_K_M and compared it to DavidAU's Q5_K_M, and noticed significant differences, especially with more advanced instructions and further development of the output.
Also, Q5_K_M and Q6_K are different quant types; this is why I prefer Q6 over Q5.
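In case it helps when evaluating reuploads: the quant type is written into the GGUF metadata, so you can confirm exactly what a file contains before testing it. A minimal sketch, assuming the `gguf` pip package that ships with llama.cpp (the filename is hypothetical):

```python
from gguf import GGUFReader

reader = GGUFReader("model-Q6_K.gguf")  # hypothetical local file
field = reader.fields["general.file_type"]
# Scalar metadata values live in the last part of the field;
# in llama.cpp's file-type enum, 18 corresponds to Q6_K.
print(int(field.parts[-1][0]))
```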
Hmm, I kind of get why it disappeared, but I don’t really understand why there’s such a big difference in GGUF content…
Unless the quants were made from high-precision weights that only the original author had, the difference is probably just something like whether an importance matrix (imatrix) was used during quantization…?
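For reference, an imatrix quant adds one extra step before quantizing. A sketch of the two llama.cpp steps driven from Python; this assumes the llama-imatrix and llama-quantize binaries are on your PATH, and the file names are hypothetical:

```python
import subprocess

# Step 1: collect an importance matrix from a calibration text.
subprocess.run(
    ["llama-imatrix", "-m", "model-F16.gguf",
     "-f", "calibration.txt", "-o", "imatrix.dat"],
    check=True,
)

# Step 2: quantize to Q6_K, weighting the rounding by the imatrix.
# Without --imatrix you get a plain ("static") quant, which is one
# reason two Q6_K files of the same model can behave differently.
subprocess.run(
    ["llama-quantize", "--imatrix", "imatrix.dat",
     "model-F16.gguf", "model-Q6_K.gguf", "Q6_K"],
    check=True,
)
```

The choice of calibration text also matters, so even two imatrix quants at the same BPW can differ.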
The graph shows the change in PPL rather than BPW itself; increases in BPW have diminishing impact at higher values, and starting from 6 BPW (Q6) the changes are nearly unnoticeable except in very long-term results.
- 5 BPW+ (Q5) is my recommended minimum; anything below is not recommended for very long-term context. The deviation (KL divergence; a rough way to compute it yourself is sketched below) is ~3.2%.
- 6 BPW+ (Q6) is where the changes become almost unnoticeable over long-term context. The deviation is ~1.1%.
- 8 BPW+ (Q8) is virtually near-perfect; the deviation is under ~0.7%.
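If you want to reproduce numbers like these, llama.cpp's llama-perplexity tool has a KL-divergence mode (run the full-precision model with --kl-divergence-base to save its logits, then the quant with --kl-divergence, if I remember the flags right). The underlying math is just KL between the two models' next-token distributions; a minimal numpy sketch with placeholder logits (in practice they'd come from running both models over the same text):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(base_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    """Mean KL(base || quant) per token; logits have shape (tokens, vocab)."""
    p = softmax(base_logits)
    q = softmax(quant_logits)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float(kl.mean())

# Placeholder data just to show the shape of the computation:
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 32000))                      # "F16" logits
quant = base + rng.normal(scale=0.05, size=base.shape)  # "quantized" logits
print(mean_kl(base, quant))
```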