Hey there,
unfortunately I do not have a GPU, and I want to run the Mistral code on my CPU only. I have a RAG project and everything works fine except the following Mistral part:
from transformers import DPRContextEncoder, DPRContextEncoderTokenizer
import torch
import faiss  # used for indexing (pip install faiss-cpu)
from transformers import (RagRetriever,
                          RagSequenceForGeneration,
                          RagTokenizer)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
.....
print("Mistral Models")
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"
# To use a different branch, change revision
# For example: revision="gptq-4bit-32g-actorder_True"
tokenizer_2 = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
print("HUUUHUUUU")
# removed device_map="cuda:0" here
model_2 = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                               trust_remote_code=False,
                                               revision="gptq-4bit-32g-actorder_True")
# Save the Mistral model and tokenizer
print("saving mistral now....")
model_2.save_pretrained("mistralModel")
tokenizer_2.save_pretrained("mistralTokenizer")
The error I get is:
.....
Mistral Models
HUUUHUUUU
CUDA extension not installed.
CUDA extension not installed.
Traceback (most recent call last):
File ".....\test.py", line 263, in <module>
model_2 = AutoModelForCausalLM.from_pretrained(model_name_or_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".....\Python311\Lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".....\Python311\Lib\site-packages\transformers\modeling_utils.py", line 3928, in from_pretrained
model = quantizer.post_init_model(model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".....Python311\Lib\site-packages\optimum\gptq\quantizer.py", line 587, in post_init_model
raise ValueError(
ValueError: Found modules on cpu/disk. Using Exllama or Exllamav2 backend requires all the modules to be on GPU.You can deactivate exllama backend by setting `disable_exllama=True` in the quantization config object
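For reference, here is what I understand the error message to be asking for: passing a quantization config with the exllama backend disabled. This is only a sketch based on the last line of the traceback, and I am assuming a transformers version that still accepts `disable_exllama` on `GPTQConfig` (newer versions replace it with `use_exllama=False`):

```python
from transformers import AutoModelForCausalLM, GPTQConfig

model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"

# Disable the GPU-only exllama kernels, as the error message suggests.
# Older transformers versions use disable_exllama=True;
# newer ones use use_exllama=False instead.
quant_config = GPTQConfig(bits=4, disable_exllama=True)

model_2 = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="cpu",                            # keep everything on CPU
    trust_remote_code=False,
    revision="gptq-4bit-32g-actorder_True",
    quantization_config=quant_config,            # overrides the repo's config
)
```

Even if this gets past the ValueError, I am not sure GPTQ inference on pure CPU is practical, since the quantized kernels are written for CUDA.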
I have already tried many different things but could not fix this issue. I would appreciate any kind of help.
Thanks, Markus