Hi,
I am using the transformers library to run inference on several models. I successfully got results running llama2-7b-chat-hf and open-llama-7b-v2 on a single GPU (model.to(device)), but to run the 13b models I need to resort to device_map="auto", which leads to the following error:
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
I also tried the 7b llama2 and open-llama models with device_map="auto" and get the same error. I have seen a number of posts reporting this error, but the causes people identified do not apply to my setup. Would anyone be able to help me figure this out?
Here is how I am loading the model:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    add_prefix_space=True,
    use_auth_token=hg_token,
    padding_side="left",
    legacy=False,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    use_auth_token=hg_token,
    device_map="balanced_low_0",  # same error with device_map="auto"
)
model.eval()

# Llama has no pad token, so reuse the EOS token for padding
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = model.config.eos_token_id
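For reference, here is a minimal sketch of how generation is invoked (the prompt and sampling settings below are placeholders for my actual values). The error message comes from torch.multinomial, i.e. it is raised during sampling inside model.generate():

import torch

# Placeholder prompt and generation settings, shown for context
prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,  # sampling path, where torch.multinomial raises the error
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.pad_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))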