ValueError: weight is on the meta device when using AutoModelForSequenceClassification

Trying to load Llama for classification, I get a ‘weight is on the meta device’ error.
Code:

import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('decapoda-research/llama-7b-hf',
                                                           torch_dtype=torch.float16,
                                                           device_map="auto",
                                                           num_labels=3)

I get the error:

  model = AutoModelForSequenceClassification.from_pretrained(
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2846, in from_pretrained
    dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/big_modeling.py", line 396, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/hooks.py", line 537, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/hooks.py", line 507, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/home/lab/user/anaconda3/envs/project/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 281, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 6.

Importantly, loading the model with AutoModelForCausalLM instead of AutoModelForSequenceClassification works.
It seems as if AutoModelForSequenceClassification creates the newly initialized classification head on the meta device, and then crashes when trying to move it to the GPU.
I tried loading everything directly into memory, without the meta device, but couldn't.
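One way to probe this, assuming enough CPU RAM for the fp16 weights, is to load without device_map (the path the workaround below relies on) and list any parameters left on the meta device; with device_map="auto" this check never runs because dispatch fails first. The head of LlamaForSequenceClassification is a Linear module named score:

import torch
from transformers import AutoModelForSequenceClassification

# Load fully on CPU (no device_map), then report any parameter without real storage.
model = AutoModelForSequenceClassification.from_pretrained(
    'decapoda-research/llama-7b-hf', torch_dtype=torch.float16, num_labels=3)
meta_params = [name for name, p in model.named_parameters() if p.device.type == 'meta']
print(meta_params or 'no parameters on the meta device')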

A workaround I came up with is loading the classification model on the CPU and saving it once with accelerator.save_model. Afterwards the model can be loaded with load_checkpoint_and_dispatch.
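A minimal sketch of that workaround (the output folder name and the no_split_module_classes value are my assumptions, and from_config with torch_dtype needs a reasonably recent transformers):

import torch
from accelerate import Accelerator, init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForSequenceClassification

# Step 1: load fully on CPU (no device_map, so nothing is left on meta)
# and save the checkpoint once.
accelerator = Accelerator()
model = AutoModelForSequenceClassification.from_pretrained(
    'decapoda-research/llama-7b-hf', torch_dtype=torch.float16, num_labels=3)
accelerator.save_model(model, 'llama-7b-seqcls')  # folder name is arbitrary

# Step 2: rebuild the model with empty (meta) weights and dispatch the saved
# checkpoint across the available GPUs.
config = AutoConfig.from_pretrained('decapoda-research/llama-7b-hf', num_labels=3)
with init_empty_weights():
    model = AutoModelForSequenceClassification.from_config(config, torch_dtype=torch.float16)
model = load_checkpoint_and_dispatch(model, 'llama-7b-seqcls', device_map='auto',
                                     no_split_module_classes=['LlamaDecoderLayer'])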

Hi @Glick, could you try this model instead: meta-llama/Llama-2-7b-hf?
I've tested it on the latest transformers and I am unable to reproduce the issue.

from transformers import AutoModelForSequenceClassification
import torch
model = AutoModelForSequenceClassification.from_pretrained('meta-llama/Llama-2-7b-hf', 
                                                           torch_dtype=torch.float16, 
                                                           device_map='auto', 
                                                           num_labels=3)
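
If it loads, a quick sanity check (the classification head of LlamaForSequenceClassification is the Linear named score) is that the head ends up on a real device:

print(model.score.weight.device)  # expect a cuda device, not 'meta'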