I am trying to load the model TheBloke/WizardLM-30B-Uncensored-GPTQ, downloaded from Hugging Face, using transformers.AutoModelForCausalLM.from_pretrained.
When I try to do this with the code below, I get the error:

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /text-generation-webui/models/TheBloke_WizardLM-30B-Uncensored-GPTQ.
I looked into that directory and it contains the following files:
added_tokens.json
config.json
generation_config.json
huggingface-metadata.txt
quantize_config.json
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
WizardLM-30B-Uncensored-GPTQ-4bit.act-order.safetensors
What is the proper way to load this Hugging Face model? This is the code I am currently using:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "/text-generation-webui/models/TheBloke_WizardLM-30B-Uncensored-GPTQ"

# bitsandbytes NF4 4-bit quantization config
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=nf4_config, device_map={"": 0}
)
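
For reference, my understanding is that GPTQ checkpoints like this one are normally loaded with the auto-gptq library rather than bitsandbytes (which quantizes full-precision weights on the fly and therefore looks for a standard pytorch_model.bin or similar file). A rough, untested sketch of what I think that would look like, assuming the auto_gptq package is installed:

# Untested sketch, assuming the auto-gptq package: load the pre-quantized
# GPTQ weights directly instead of re-quantizing with bitsandbytes.
from auto_gptq import AutoGPTQForCausalLM

model_dir = "/text-generation-webui/models/TheBloke_WizardLM-30B-Uncensored-GPTQ"

model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    # The checkpoint has a non-standard filename, so point at it explicitly
    # (model_basename is the filename without the .safetensors extension).
    model_basename="WizardLM-30B-Uncensored-GPTQ-4bit.act-order",
    use_safetensors=True,
    device="cuda:0",
)

Is that the right approach here, or is there a way to make AutoModelForCausalLM.from_pretrained work with this checkpoint directly?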