Infer_auto_device_map returns empty

I am following the instructions in this post to load the same OPT-13B model. I have access to 8 NVIDIA A100 80GB GPUs.

model = AutoConfig.from_pretrained("facebook/opt-13b") runs successfully and the model card is available to use. Excerpt of the model:

  (model): OPTModel(
    (decoder): OPTDecoder(
      (embed_tokens): Embedding(50272, 5120, padding_idx=1)
      (embed_positions): OPTLearnedPositionalEmbedding(2050, 5120)
      (final_layer_norm): LayerNorm((5120,), eps=1e-05, elementwise_affine=True)
      (layers): ModuleList(

However, device_map = infer_auto_device_map(model) returns {'': 0} despite 8 GPUs being available.

The code is exactly the same as in the blog post:

from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("facebook/opt-13b")
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

device_map = infer_auto_device_map(model)

Not sure why this is the case.
Any suggestions/help appreciated. Thanks!

Your whole model fits on a single GPU, so infer_auto_device_map just returns that ({'': 0} means "put everything on GPU 0"). You can pass a max_memory argument if you want to limit how much memory is used on each GPU, which forces the model to be split across several of them.
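
To make the arithmetic behind that answer concrete, here is a rough, standard-library-only sketch. It is not accelerate's actual algorithm (which also deals with tied weights, modules that must not be split, and CPU/disk offload). It assumes roughly 13 billion parameters stored as float32 and, purely for illustration, treats the model as 40 equally sized blocks; the names estimate_bytes and greedy_device_map are hypothetical helpers, not library functions.

```python
# Simplified illustration only; NOT accelerate's implementation.
GIB = 1024 ** 3


def estimate_bytes(num_params, bytes_per_param=4):
    """Rough weight footprint: parameter count times bytes per parameter."""
    return num_params * bytes_per_param


def greedy_device_map(block_sizes, max_memory):
    """Fill devices in order, moving to the next one when a block no longer fits.

    block_sizes: dict mapping block name -> size in bytes (insertion order kept)
    max_memory:  dict mapping device id -> byte budget (insertion order kept)
    """
    devices = list(max_memory)
    idx, used = 0, 0
    device_map = {}
    for name, size in block_sizes.items():
        # Skip ahead until the current device has room for this block.
        while idx < len(devices) and used + size > max_memory[devices[idx]]:
            idx += 1
            used = 0
        if idx == len(devices):
            raise ValueError(f"{name} does not fit in the remaining budget")
        device_map[name] = devices[idx]
        used += size
    # If every block landed on the same device, collapse the map to a
    # single root entry, which is what {'': 0} expresses.
    placed = set(device_map.values())
    if len(placed) == 1:
        return {"": placed.pop()}
    return device_map


# Assumed numbers: ~13e9 parameters in float32, split into 40 equal blocks.
total = estimate_bytes(13_000_000_000)  # 52_000_000_000 bytes, about 48.4 GiB
blocks = {f"block_{i}": total // 40 for i in range(40)}

# 8 devices with 80 GiB each: everything fits on device 0.
print(greedy_device_map(blocks, {i: 80 * GIB for i in range(8)}))  # {'': 0}

# Cap each device at 10 GiB: the blocks now spill over onto later devices.
capped = greedy_device_map(blocks, {i: 10 * GIB for i in range(8)})
print(sorted(set(capped.values())))  # [0, 1, 2, 3, 4]
```

With accelerate itself, the equivalent of the second call would be to pass a per-device budget such as max_memory={0: "10GiB", 1: "10GiB", ...} to infer_auto_device_map; the 10GiB figure is only an example value.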


I found that Auto classes sometimes cause this {'': 0} issue. In my case, I'm using T5-large for inference, and infer_auto_device_map() returns a proper device map if I load the model with the T5ForConditionalGeneration class, but {'': 0} if I load it with AutoModelForSeq2SeqLM. However, Auto classes do not always behave like that. For example, infer_auto_device_map() works fine with google/flan-ul2 loaded with AutoModelForSeq2SeqLM.