Getting GPU info from Accelerate

aclifton314 · July 5, 2022, 6:46pm

I was wondering if it were possible to print out which GPU a model is being trained on when using accelerate as well as how many GPUs will be used for training?

muellerzr · July 6, 2022, 7:26pm

You should look at the output of accelerate env from the CLI. You can also configure this yourself by running accelerate config before training.

Otherwise it’s highly dependent on how you start the script. E.g. if you call torchrun itself and use accelerate in the script, it’ll use all of the GPUs available.

How are you calling your script?

aclifton314 · July 6, 2022, 7:49pm

@muellerzr Thank you for your response!

I call my script tmp.py like this: accelerate launch tmp.py

muellerzr · July 6, 2022, 7:53pm

Then it would be whatever accelerate env has configured. If that hasn’t been configured yet, then most likely it’s using all of your GPUs. (Though I think accelerate launch will give you an error if you haven’t configured it yet.)

aclifton314 · July 6, 2022, 7:57pm

I did run accelerate config to set things up to use all 4 GPUs. Let me rephrase my question. I’d like to be able to log the total number of GPUs accelerate is using from within my python script tmp.py for informational/debugging purposes. Is it possible to get that information from the Accelerator object created in a python script?

muellerzr · July 6, 2022, 8:07pm

Yes, for that you’d want the following information in Accelerator:

Accelerator.num_processes and Accelerator.distributed_type.

You can also gather these from the AcceleratorState class by doing:

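from accelerate.state import AcceleratorState
# AcceleratorState shares its data with the Accelerator in the same process
state = AcceleratorState()
num_devices, device_kind = state.num_processes, state.distributed_type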

(This does the same thing; you should probably just grab them from the accelerator object you have already made.)

distributed_type will return a DistributedType, which is an enum. You can do str(device_kind) to get a string that looks like the following:

'DistributedType.MULTI_GPU'
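
For logging purposes, the attributes above can be wrapped in a small helper. This is only a rough sketch: summarize_accelerator and main are hypothetical names, not part of Accelerate. It assumes a multi-GPU launch where each process drives one GPU, so num_processes matches the GPU count. It also reads accelerator.device and accelerator.process_index to show which device each process was assigned.

```python
def summarize_accelerator(accelerator):
    """Return a one-line summary of what this process sees (hypothetical helper)."""
    return (
        f"process {accelerator.process_index} of {accelerator.num_processes} "
        f"| device: {accelerator.device} "
        f"| distributed_type: {accelerator.distributed_type}"
    )


def main():
    # Imported lazily so the helper above has no hard dependency on accelerate.
    from accelerate import Accelerator

    accelerator = Accelerator()

    # Every process prints its own line, showing the device it was assigned.
    print(summarize_accelerator(accelerator))

    # Print the total only once, from the main process.
    if accelerator.is_main_process:
        print(f"Total processes: {accelerator.num_processes}")
```

Calling main() at the end of tmp.py and starting it with accelerate launch tmp.py should print one summary line per process, plus a single total from the main process.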

aclifton314 · July 6, 2022, 8:27pm

That works like a charm!! Thank you!
