Failed to create LLM 'llama' from .GGUF

I was able to put together some code that successfully runs several of the models stored here; I focused on Llama 7B GGUF models. I then fine-tuned one with LoRA, merged the adapter weights back in, quantized the result, and converted it to GGUF. However, when I try to load the new GGUF model using the same code that works for the downloaded models, I get this error:
Error loading model: Failed to create LLM 'llama' from 'D:\models\finalModel\finalModel.gguf'.
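
For reference, this is roughly the loading code I'm using; the path is the one from the error above, and `model_type="llama"` matches the 'llama' in the message:

```python
from ctransformers import AutoModelForCausalLM

# Works for the downloaded Llama 7B GGUF models, fails for my converted one
llm = AutoModelForCausalLM.from_pretrained(
    r"D:\models\finalModel\finalModel.gguf",
    model_type="llama",
)
print(llm("Hello"))
```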
I've stepped through the ctransformers AutoModelForCausalLM.from_pretrained() method in a scratch environment, and both models appear to load in pretty much the same way until execution reaches avx2/ctransformers.dll, at which point it switches from Python to C++ and I can't continue inspecting them.
I've also parsed the two binaries as best I can: both have the magic number set to GGUF and matching tensor counts. I'm not sure how to track down what is causing the error or what other comparisons I could do.
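
A check along these lines is how I compared the headers. The fixed GGUF header is magic, version, tensor count, then metadata KV count; the downloaded-model path below is just a placeholder:

```python
import struct

def read_gguf_header(path):
    """Read the fixed-size GGUF header fields from the start of the file."""
    with open(path, "rb") as f:
        magic = f.read(4)                               # b"GGUF" for a valid file
        version, = struct.unpack("<I", f.read(4))       # uint32, little-endian
        tensor_count, = struct.unpack("<Q", f.read(8))  # uint64
        kv_count, = struct.unpack("<Q", f.read(8))      # uint64
    return magic, version, tensor_count, kv_count

# Compare the model that loads against the one that fails
for p in (r"D:\models\downloaded\llama-7b.gguf",  # placeholder path
          r"D:\models\finalModel\finalModel.gguf"):
    print(p, read_gguf_header(p))
```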
