How to use specific gpu in accelerate?

nastyhilda · November 20, 2023, 5:42am

I want to use different GPUs under different conditions.
But accelerator.device always seems to be cuda:0. I tried to change it to cuda:1 or cuda:2, but it couldn’t be modified. How can I fix this?

muellerzr · November 20, 2023, 3:02pm

You can either set CUDA_VISIBLE_DEVICES or pass the --gpu_ids param, either in your config or to accelerate launch.

hanjohn · November 21, 2023, 5:27am

I’m a bit confused by your response. In fact, I already added the following code at the beginning of my file, specifying GPU number 6:

os.environ['CUDA_VISIBLE_DEVICES'] = str(6)
device = torch.device(f"cuda:{args.gpu}" if torch.cuda.is_available() else "cpu")

However, after using accelerator = Accelerator() , when I check accelerator.device , it still shows device(type='cuda') and remains on device 0. If I manually set accelerator.device = self.device , it throws an error: AttributeError: can't set attribute . Could you please provide more detailed instructions on how to specify the GPU using the Accelerator? I would greatly appreciate it.
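The "can't set attribute" message is what Python raises when code assigns to a property that has no setter. A minimal sketch of that behaviour, assuming `device` is exposed as a read-only property (which the error suggests); `ReadOnlyDevice` is a hypothetical stand-in, not the real Accelerator class:

```python
class ReadOnlyDevice:
    """Hypothetical stand-in for an object exposing a read-only `device`."""

    def __init__(self, device: str):
        self._device = device

    @property
    def device(self) -> str:
        # No setter is defined, so assignment to `device` is rejected.
        return self._device


holder = ReadOnlyDevice("cuda")
try:
    holder.device = "cuda:6"  # same pattern as `accelerator.device = ...`
    assignment_failed = False
except AttributeError as err:
    assignment_failed = True
    print(f"AttributeError: {err}")
```

The exact wording of the message varies between Python versions, but the exception type is the same. The device therefore has to be chosen before the object is created, not patched afterwards.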

nastyhilda · November 21, 2023, 5:57am

exactly the same situation!!

muellerzr · November 21, 2023, 1:56pm

You cannot do this inside your Python file like that. It has to be done before your Python file is run, or (possibly) before torch, accelerate, or anything else that initializes the GPU has been imported.

So solutions:

accelerate launch --gpu_ids 6 myscript.py

CUDA_VISIBLE_DEVICES=6 python myscript.py

(I don’t know if this last solution, shown next, will actually work; I haven’t tried it before.)

import os
os.environ["CUDA_VISIBLE_DEVICES"] = str(6)
import torch
...
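A slightly fuller sketch of the third option above, assuming the variable is set before anything imports torch. The value "6" is just the GPU number used in this thread, and `describe_device` is a hypothetical helper for checking the result:

```python
import os

# Must run before torch/accelerate are imported: the CUDA runtime reads
# this variable when it is first initialized in the process.
os.environ["CUDA_VISIBLE_DEVICES"] = "6"  # physical GPU 6 (example value)


def describe_device() -> str:
    """Report what the process can see after the variable is set."""
    try:
        import torch  # imported only after the variable is set
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "cpu"
    # Only one GPU is visible, so inside the process it is cuda:0.
    return f"cuda:0 -> {torch.cuda.get_device_name(0)}"


print(describe_device())
```

If any earlier import (including one hidden inside another package) has already initialized CUDA, the assignment has no effect, which is why setting the variable on the command line is the safer choice.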

hanjohn · November 21, 2023, 2:59pm

Thank you for your prompt reply. However, when following your guidance I encountered the following error:

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
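A likely cause of this error, though not confirmed in the thread: once CUDA_VISIBLE_DEVICES=6 is in effect, the process sees a single GPU, and it is renumbered as cuda:0. Asking for cuda:6 (for example via torch.device(f"cuda:{args.gpu}")) then points at a device that does not exist. The renumbering can be sketched in plain Python; `visible_index` is a hypothetical helper and assumes the variable holds integer IDs rather than UUIDs:

```python
def visible_index(cuda_visible_devices: str, physical_id: int) -> int:
    """Map a physical GPU number to the index the process must use."""
    visible = [int(x) for x in cuda_visible_devices.split(",") if x.strip()]
    if physical_id not in visible:
        raise ValueError(f"GPU {physical_id} is hidden from this process")
    # Visible devices are renumbered from 0 in the order they are listed.
    return visible.index(physical_id)


print(visible_index("6", 6))    # the only visible GPU
print(visible_index("4,6", 6))  # second of two visible GPUs
```

So with a single visible GPU, the script should use cuda:0 (or simply accelerator.device) rather than the physical number.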

nastyhilda · November 22, 2023, 3:09am

Thank you I solved it

hanjohn · November 22, 2023, 4:57am

Hi, Could you please explain in more detail how you solved this problem? I’m interested to understand the steps you took to resolve it.

nastyhilda · November 22, 2023, 5:14am

When changing your GPU through os.environ['CUDA_VISIBLE_DEVICES'], it is important that the line setting the environment variable comes first, before all the imports. I think it failed for me because I tried to change the GPU after Accelerate, torch, or tensorflow had already been imported, so the order of that line relative to the imports seems to be the main problem.

hanjohn · November 23, 2023, 12:49am

Thank you I solved it

cs-mshah · April 25, 2024, 7:33pm

None of this worked for me. Everywhere I added CUDA_VISIBLE_DEVICES=4 but somehow it is taking 0. Nowhere in the code have I specified 0. This is the crappiest thing I’ve ever experienced. Wasted 1 hour and still not resolved. Kindly change the api to something simpler and respect the environment variables. Here is the code I’m using: GitHub - thuanz123/realfill
