Multiple GPU in SFTTrainer

Milku · June 13, 2024, 6:40am

How can I use SFTTrainer to leverage all GPUs automatically? If I add device_map="auto", I get a CUDA out-of-memory exception, although I have 4x Nvidia T4 GPUs.

CUDA is installed and my environment can see the available GPUs.

nielsr · June 13, 2024, 7:32am

Hi,

I would recommend taking a look at the example scripts here: alignment-handbook/scripts at main · huggingface/alignment-handbook · GitHub. It includes scripts that can be run with ZeRO-3 on 8 GPUs. For that, they define an Accelerate config. Basically, you need to run the script with accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml (in the YAML you define what hardware the script should run on).
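
If the launch command has to be started from Python rather than typed in a shell, one option is to wrap it in a subprocess call. This is only a minimal sketch: the script name `train_sft.py` and the helper names `build_launch_command` and `launch_training` are hypothetical, and it assumes the `accelerate` executable is on the PATH of the Python process.

```python
import subprocess


def build_launch_command(config_file, script, script_args=()):
    """Return the argv list for running a training script via `accelerate launch`."""
    return [
        "accelerate", "launch",
        "--config_file", config_file,
        script,          # hypothetical training script path
        *script_args,    # extra arguments are passed through to the script
    ]


def launch_training(config_file, script, script_args=()):
    """Start the launcher as a child process and wait for it to finish."""
    cmd = build_launch_command(config_file, script, script_args)
    # check=True raises CalledProcessError if the training run exits non-zero.
    return subprocess.run(cmd, check=True)
```

Calling `launch_training("recipes/accelerate_configs/deepspeed_zero3.yaml", "train_sft.py")` from a pipeline step would then spawn one worker per GPU as defined in the YAML, while the caller itself stays a plain Python process.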

Milku · June 15, 2024, 2:02pm

The problem is that I need to run it via Python because I'm using Vertex AI pipelines for MLOps. Is there no way to avoid calling accelerate on my script via bash?

Milku · June 28, 2024, 6:00am

Is using Optimum a better solution if I want to run it via Python, without the CLI launch you mentioned?

saeed899 · December 27, 2024, 2:20pm

Same issue.
@nielsr What about when we only have Python notebook access?
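
For the notebook-only case, Accelerate ships a `notebook_launcher` function that starts a training function on several GPUs from inside a notebook cell. The following is a rough sketch rather than a tested recipe: `training_function` and `launch_from_notebook` are hypothetical names, the SFTTrainer setup is left as a placeholder comment, and it assumes CUDA has not already been initialized in the notebook session before launching.

```python
def training_function():
    # Imports and model creation live inside the function so that each
    # spawned process builds its own state on its own GPU.
    from accelerate import Accelerator

    accelerator = Accelerator()
    # Placeholder: build the model, dataset and SFTTrainer here,
    # then call trainer.train().
    accelerator.print(f"Running on {accelerator.num_processes} processes")


def launch_from_notebook(num_processes=4):
    """Spawn `num_processes` workers (assumed: one per GPU) from a notebook cell."""
    from accelerate import notebook_launcher

    notebook_launcher(training_function, args=(), num_processes=num_processes)
```

With 4 GPUs, running `launch_from_notebook(4)` in a cell would start four processes. Note that this is data-parallel training, so each process still needs to fit the model on a single GPU unless sharding is configured separately.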
