Run pre-trained LLM model on CPU - ValueError: Expected a cuda device, but got: cpu

Hi, I am using an LLM, CohereForAI/c4ai-command-r-plus-4bit, for inference. I have a GPU, but it is not powerful enough, so I want to run the model on CPU. Below is my example code and the error it produces.

Code:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

PRETRAIN_MODEL = 'CohereForAI/c4ai-command-r-plus-4bit'
tokenizer = AutoTokenizer.from_pretrained(PRETRAIN_MODEL)
model = AutoModelForCausalLM.from_pretrained(PRETRAIN_MODEL, device_map='cpu')

text = "this is an example"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # A CausalLM output has logits, not last_hidden_state, so request hidden states explicitly
    outputs = model(**inputs, output_hidden_states=True)
    embedding = outputs.hidden_states[-1].mean(dim=1).squeeze().numpy()
print(embedding.shape)
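For what it's worth, the mean-pooling step itself runs fine on CPU with a stand-in tensor (dummy shapes below are made up for illustration), so the failure seems to happen while loading the 4-bit weights rather than in my pooling code:

```python
import torch

# Stand-in for model hidden states: batch=1, seq_len=4, hidden_size=8
dummy_hidden = torch.randn(1, 4, 8)

# Same pooling as in my script: average over the token dimension
embedding = dummy_hidden.mean(dim=1).squeeze().numpy()
print(embedding.shape)  # (8,)
```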

Error:

ValueError: Expected a cuda device, but got: cpu

transformers environment information:

  • transformers version: 4.40.0.dev0
  • Platform: Linux-5.4.0-150-generic-x86_64-with-glibc2.27
  • Python version: 3.11.8
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.2
  • Accelerate version: 0.29.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.2 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Does this mean the c4ai-command-r-plus-4bit model can only run on a GPU? Is there something I missed that would let it run on CPU? Thanks!