Converting and de-quantizing GGUF tensors…: 0%| | 1/292 [00:00<00:00, 2328.88it/s]
Traceback (most recent call last):
File "./tokenize_text.py", line 15, in <module>
model = AutoModel.from_pretrained(model_id, gguf_file=filename, local_files_only=True)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3661, in from_pretrained
state_dict = load_gguf_checkpoint(gguf_path, return_tensors=True)["tensors"]
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 148, in load_gguf_checkpoint
weights = load_dequant_gguf_tensor(shape=shape, ggml_type=tensor.tensor_type, data=tensor.data)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/integrations/ggml.py", line 493, in load_dequant_gguf_tensor
values = dequantize_q8_0(data)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/integrations/ggml.py", line 335, in dequantize_q8_0
scales = np.frombuffer(data, dtype=np.float16).reshape(num_blocks, 1 + 16)[:, :1].astype(np.float32)
ValueError: cannot reshape array of size 279085056 into shape (3772,17)
Same symptoms: the tokenizer loads fine, but I get this reshape error when trying to create the model. I have tried several different GGUF files with both 4-bit and 8-bit quantization, but it doesn't seem to matter.
I'm not sure what I am doing wrong.
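For context on what the failing line is doing: a Q8_0 block in GGUF is 34 bytes, a 2-byte float16 scale followed by 32 signed 8-bit weights, so viewing the buffer as float16 and reshaping to `(num_blocks, 17)` only works when `num_blocks` matches the actual byte count. A minimal sketch of Q8_0 dequantization with NumPy (an illustration of the block layout, not the transformers implementation):

```python
import numpy as np

Q8_0_BLOCK_BYTES = 34  # 2-byte fp16 scale + 32 int8 quantized weights


def dequantize_q8_0(data: bytes) -> np.ndarray:
    """Dequantize a raw Q8_0 tensor buffer into float32 weights."""
    num_blocks = len(data) // Q8_0_BLOCK_BYTES
    blocks = np.frombuffer(data, dtype=np.uint8).reshape(num_blocks, Q8_0_BLOCK_BYTES)
    # First 2 bytes of each block: the per-block fp16 scale.
    scales = blocks[:, :2].copy().view(np.float16).astype(np.float32)
    # Remaining 32 bytes: signed 8-bit quantized values.
    quants = blocks[:, 2:].copy().view(np.int8).astype(np.float32)
    # Each weight is quant * scale; flatten back to a 1-D array.
    return (scales * quants).reshape(-1)
```

If `len(data)` is not a multiple of 34 (or `num_blocks` is derived from a shape that disagrees with the buffer), the reshape fails with exactly this kind of `ValueError`.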
@H4rryM3ss I was able to get past this error by cloning the huggingface/transformers repository and running the latest code from there. At least the model loads now.
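For anyone else hitting this, installing the latest transformers code from the repository can be done in one step with pip (a sketch, assuming git and pip are available):

```shell
# Install transformers from the main branch instead of the PyPI release
pip install --upgrade git+https://github.com/huggingface/transformers.git
```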
Hey @jpuser1, thanks for letting me know. I had worked around this by loading the model with Llama, but I will give what you described a try.