Meta device error while instantiating model

Code:

from accelerate import init_empty_weights
from transformers import OPTForCausalLM, AutoTokenizer
import torch

with init_empty_weights():
    model = OPTForCausalLM.from_pretrained(
        "facebook/opt-1.3b",
        device_map="auto",
        offload_folder="/tmp/opt-1.3b-offload-accelerate",
    )

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
inputs = tokenizer("Hello, my name is", return_tensors="pt")

with torch.no_grad():
    inputs = inputs.to(0)
    output = model.generate(inputs["input_ids"])
    print(tokenizer.decode(output[0].tolist()))

Traceback:

Traceback (most recent call last):
  File "run_inference.py", line 6, in <module>
    model = OPTForCausalLM.from_pretrained(
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2529, in from_pretrained
    dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/big_modeling.py", line 318, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/hooks.py", line 488, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/hooks.py", line 464, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/hooks.py", line 148, in add_hook_to_module
    module = hook.init_hook(module)
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/hooks.py", line 237, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/home/sh0416/anaconda3/envs/personal/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 127, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 1.

How can I resolve this error? I don't know how to fix it.

When I removed the init_empty_weights context, the error went away.
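For reference, this is the call without the wrapper. As far as I understand, from_pretrained already initializes the model on the meta device internally when device_map is set and then fills in the real weights, so the extra init_empty_weights context is what leaves the parameters empty:

from transformers import OPTForCausalLM

# from_pretrained with device_map="auto" handles meta-device initialization
# and weight loading itself, so no init_empty_weights context is needed
model = OPTForCausalLM.from_pretrained(
    "facebook/opt-1.3b",
    device_map="auto",
    offload_folder="/tmp/opt-1.3b-offload-accelerate",
)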

I got the exact same error when I tried to load a non-sharded pytorch_model.bin into an empty-weights model. Here is my code:

from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoModelForCausalLM

# config is the model's AutoConfig; device_map is the dict shown below
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

model = load_checkpoint_and_dispatch(
    model, "/home/ec2-user/gpt-neo-2.7/", device_map=device_map
)

And here is the device map:

{'transformer.wte': 0,
 'transformer.wpe': 0,
 'transformer.drop': 0,
 'transformer.h.0': 0,
 'transformer.h.1': 0,
 'transformer.h.2': 0,
 'transformer.h.3': 0,
 'transformer.h.4': 0,
 'transformer.h.5': 0,
 'transformer.h.6': 0,
 'transformer.h.7': 0,
 'transformer.h.8': 0,
 'transformer.h.9': 0,
 'transformer.h.10': 0,
 'transformer.h.11': 0,
 'transformer.h.12': 0,
 'transformer.h.13': 0,
 'transformer.h.14': 0,
 'transformer.h.15': 0,
 'transformer.h.16': 0,
 'transformer.h.17': 0,
 'transformer.h.18': 0,
 'transformer.h.19': 0,
 'transformer.h.20.ln_1': 0,
 'transformer.h.20.attn': 0,
 'transformer.h.20.ln_2': 0,
 'transformer.h.21': 1,
 'transformer.h.22': 1,
 'transformer.h.23': 1,
 'transformer.h.24': 1,
 'transformer.h.25': 1,
 'transformer.h.26': 1,
 'transformer.h.27': 1,
 'transformer.h.28': 1,
 'transformer.h.29': 1,
 'transformer.h.30': 1,
 'transformer.h.31': 1,
 'transformer.ln_f': 1,
 'lm_head': 1,
 'transformer.h.20.mlp': 1}

And it returns the same error

ValueError: weight is on the meta device, we need a `value` to put in on 1.

Isn't the point of putting an empty model on the "meta" device to avoid unnecessary loading? It now seems the model has to be loaded onto a non-meta device first before load_checkpoint_and_dispatch can be used?


I got the same issue.
In my case, some submodules in the model were not initialized from the checkpoint; deleting those submodules before calling load_checkpoint_and_dispatch fixed it.
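A minimal sketch of that workaround, assuming the uninitialized submodule is a hypothetical model.extra_head (replace it with whatever submodule has no weights in your checkpoint):

from accelerate import load_checkpoint_and_dispatch

# hypothetical: `extra_head` stands for a submodule with no weights in the
# checkpoint, which would otherwise stay on the meta device and raise the error
del model.extra_head

model = load_checkpoint_and_dispatch(
    model, "/path/to/checkpoint", device_map="auto"
)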

Hi everyone!
If you want to better understand how big model inference works and why we use the meta device, I invite you to read the Accelerate big model inference documentation; it should answer most of the questions here.
If you want to load a model from transformers, I suggest using the Accelerate integration in transformers (pass device_map to from_pretrained). Otherwise, the following snippet also works, but you may run into issues since it is not the intended use:

from accelerate import init_empty_weights
from transformers import AutoModelForCausalLM, AutoConfig, AutoTokenizer
from accelerate import load_checkpoint_and_dispatch
from huggingface_hub import snapshot_download

config = AutoConfig.from_pretrained("EleutherAI/gpt-neo-2.7B")
weights = snapshot_download("EleutherAI/gpt-neo-2.7B")
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)
# needed for transformers models: ties the input and output embeddings
model.tie_weights()

# need to set `no_split_module_classes` so GPTNeoBlock layers are never split across devices
model = load_checkpoint_and_dispatch(
    model, weights, device_map="auto", no_split_module_classes=["GPTNeoBlock"]
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
prompts = ["I would like to"]
token_dict = tokenizer(prompts, return_tensors="pt").to(1)
output_ids = model.generate(**token_dict, max_new_tokens=20)
print(tokenizer.batch_decode(output_ids))
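
If you want to double-check where each submodule was dispatched, one option (a small sketch relying on the _hf_hook attribute that Accelerate attaches to dispatched modules) is:

# print the execution device chosen for each dispatched submodule
for name, module in model.named_modules():
    hook = getattr(module, "_hf_hook", None)
    if hook is not None and getattr(hook, "execution_device", None) is not None:
        print(name, hook.execution_device)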