Can’t pull the Q4 model from IlyaGusev/saiga_llama3_8b_gguf on Hugging Face via the “Use this model” -> Ollama button.
All Q2, Q4, Q8, and F16 quants exist in the repository and I can download the files, but “Use this model” -> Ollama only works with Q2_K, Q8_0, and F16. It worked a month ago.
By default the model field is empty, but if I copy the command I get: “ollama run hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q4_K_M”.
If I run ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q4_K_M, the output is “The specified tag is not available in the repository. Please use another tag or latest”.
If I change the tag and run ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q4_K, the output is “The specified tag is not a valid quantization scheme. Please use another tag or latest”.
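For reference, a minimal sketch of the commands involved, assuming the hf.co/{user}/{repo}:{quant} tag syntax that the “Use this model” button generates (the specific tags below just mirror the behaviour described above):
```
# Tags that the button / pull currently resolve for this repo:
ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q2_K
ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q8_0
ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:F16

# Tags that currently fail, even though the corresponding files exist in the repo:
ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q4_K_M   # "tag is not available in the repository"
ollama pull hf.co/IlyaGusev/saiga_llama3_8b_gguf:Q4_K     # "tag is not a valid quantization scheme"
```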
Is this an HF issue?
I’m not sure if this is an HF issue or an Ollama issue…
Either way, it’s an issue.
Related GitHub issue (opened 12 Jan 2025, labeled “bug”):
## Bug description
The configuration here: https://github.com/huggingface/chat-ui/blob/main/docs/source/configuration/models/providers/ollama.md says that the model name should be "mistral". It should be "mistral:latest" instead. Furthermore, the auto-pulling of ollama models originally added here: https://github.com/huggingface/chat-ui/pull/1227 does not work with this model name inconsistency.
## Steps to reproduce
Vanilla chat-ui setup with Ollama installed. Use the config from the docs above. Download Mistral with "ollama pull mistral". chat-ui is then unable to use ollama/mistral.
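A minimal reproduction sketch, assuming a default local Ollama listening on port 11434:
```
# Pull the model the way the docs suggest; Ollama registers it under the "latest" tag.
ollama pull mistral

# List the locally available models; the name comes back as "mistral:latest",
# not "mistral", which is what chat-ui fails to match against the configured model name.
curl http://localhost:11434/api/tags
```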
## Context
### Logs
Here's a snippet from ollama's tags:
```
$> curl http://localhost:11434/api/tags
{"models":[{"name":"mistral:latest","model":"mistral:latest","modified_at":"2025-01-11T16:17:32.785621658-08:00","size":4113301824,"digest":"f974a74358d62a017b37c6f424fcdf2744ca02926c4f952513ddf474b2fa5091","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"7.2B","quantization_level":"Q4_0"}},{"name":"tinyllama:latest","model":"tinyllama:latest","modified_at":"2025-01-11T16:03:16.107607114-08:00","size":637700138,"digest":"2644915ede352ea7bdfaff0bfac0be74c719d5d5202acb63a6fb095b52f394a4","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"1B","quantization_level":"Q4_0"}}]}
```
### Specs
- **OS**: Mac
- **Browser**: Firefox
- **chat-ui commit**: f82af0b
## Notes
If you change the config to:
```
MONGODB_URL=mongodb://localhost:27017/
MODELS=`[
  {
    "name": "Ollama Mistral",
    "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["</s>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url": "http://127.0.0.1:11434",
        "ollamaName": "mistral:latest"
      }
    ]
  }
]`
```
Notice the ollamaName: "mistral:latest"; with that set, it works.
I’m not sure if this is a bug, but I noticed an inconsistency in the file type labels for these GGUF models:
Model (Q2_K):
Model (Q4_K):
The Q2_K file is labeled simply as Q2_K, while the Q4_K file shows Q4_K_M. Is this a labeling inconsistency, or could it indicate an issue with the GGUF file format? Shouldn’t both follow the same naming convention (e.g., Qx_K)? Could this affect compatibility, or is it just a metadata oversight?
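One way to check where the label comes from is to read the quantization recorded in the GGUF header itself, independent of the file name. A minimal sketch, assuming the gguf Python package (which ships a gguf-dump script) and locally downloaded copies of the two files (the filenames below are placeholders):
```
pip install gguf

# general.file_type in the GGUF header records the actual quantization scheme
# of the file, regardless of what the file happens to be named.
gguf-dump saiga_llama3_q2_K.gguf | grep file_type   # placeholder filename
gguf-dump saiga_llama3_q4_K.gguf | grep file_type   # placeholder filename
```
If the header of the "Q4_K" file reports Q4_K_M, the label on the Hub is just reflecting the metadata rather than the file name.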
There are no specific rules for GGUF file names… Either naming seems fine.
If it was working fine before (a month ago), then either Ollama or HF used to resolve the files by file name and has since changed how it matches them.
> Hello, I'm wondering what quantization method or what you want to call it has the best output quality. Should you use q8_0, q4_0 or anything in between? I'm asking this question because the q8_0 ve...
I don’t know, but I have successfully used a q4 model before.
As John6666 said, it seems like the HF “Use this model” integration became more “strict” and began checking the quantization name in the model metadata against the file name / quantization tag.