I’m launching my script via `accelerate launch --num_processes 1 train.py`. However, the following code prints the following output:
# Code

```python
logger.warning(
    "Process rank: %s, device: %s, n_gpu: %s, distributed training: %s, 16-bits training: %s",
    training_args.local_rank,
    training_args.device,
    training_args.n_gpu,
    bool(training_args.local_rank != -1),
    training_args.fp16,
)
```

# Output

```
Process rank: 0, device: cuda:0, n_gpu: 4, distributed training: True, 16-bits training: True
```
Why is `n_gpu` 4, and why is `local_rank` not being set to its default value of -1?
The code runs fine when I set `CUDA_VISIBLE_DEVICES=0`, but I would have thought that HF Accelerate would handle this without my having to set environment variables.
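For reference, these are the workarounds I have tried or am considering; the `--gpu_ids` flag is my assumption from skimming recent accelerate release notes and may not exist in older versions:

```shell
# Restrict the visible devices via the environment before launch
# (this is what currently works for me)
CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 train.py

# Let accelerate itself pick the device, if the flag is supported
# in the installed version (assumption, untested)
accelerate launch --num_processes 1 --gpu_ids 0 train.py
```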