Failed to create LLM 'llama' from .GGUF

I was able to put together some code that successfully runs several of the models stored here; I focused on Llama 7B GGUF models. I then fine-tuned one with LoRA, merged the adapter weights back in, quantized the result, and converted it to GGUF. However, when I try to load the new GGUF model using the same code that works for the downloaded models, I get this error:
Error loading model: Failed to create LLM 'llama' from 'D:\models\finalModel\finalModel.gguf'.
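
For reference, this is roughly the loading code I'm using; the path is the one from the error above, and `model_type="llama"` matches the 'llama' in the message:

```python
from ctransformers import AutoModelForCausalLM

# Works for the downloaded Llama 7B GGUF models, fails for my converted one
llm = AutoModelForCausalLM.from_pretrained(
    r"D:\models\finalModel\finalModel.gguf",
    model_type="llama",
)
print(llm("Hello"))
```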
I've stepped through the ctransformers AutoModelForCausalLM.from_pretrained() method in a scratch environment, and both models appear to load in pretty much the same way until execution reaches avx2/ctransformers.dll, at which point it switches from Python to C++ and I can't continue inspecting them.
I've also parsed the two binaries as best I can: both have the magic number set to GGUF and matching tensor counts. I'm not sure how to track down what is causing the error or what other comparisons I could do.
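
A check along these lines is how I compared the headers. The fixed GGUF header is magic, version, tensor count, then metadata KV count; the downloaded-model path below is just a placeholder:

```python
import struct

def read_gguf_header(path):
    """Read the fixed-size GGUF header fields from the start of the file."""
    with open(path, "rb") as f:
        magic = f.read(4)                               # b"GGUF" for a valid file
        version, = struct.unpack("<I", f.read(4))       # uint32, little-endian
        tensor_count, = struct.unpack("<Q", f.read(8))  # uint64
        kv_count, = struct.unpack("<Q", f.read(8))      # uint64
    return magic, version, tensor_count, kv_count

# Compare the model that loads against the one that fails
for p in (r"D:\models\downloaded\llama-7b.gguf",  # placeholder path
          r"D:\models\finalModel\finalModel.gguf"):
    print(p, read_gguf_header(p))
```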
