Multiple GPU in SFTTrainer

How can I use SFTTrainer to leverage all GPUs automatically? If I add `device_map="auto"` I get a CUDA out-of-memory exception, even though I have 4x NVIDIA T4 GPUs.

CUDA is installed and my environment can see the available GPUs.


I would recommend taking a look at the example scripts here: alignment-handbook/scripts at main · huggingface/alignment-handbook · GitHub. It includes scripts that can be run with ZeRO-3 on 8 GPUs. For that, they define an Accelerate config. Basically, you need to run the script with `accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml` (in the YAML you define what hardware the script should run on).
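For reference, a ZeRO-3 Accelerate config looks roughly like the sketch below. This is not copied from the handbook repo; it is a minimal example using field names from Accelerate's DeepSpeed integration, adapted to `num_processes: 4` for a 4-GPU machine. Check the actual `deepspeed_zero3.yaml` in the repo for the authoritative version.

```yaml
# Sketch of an Accelerate config enabling DeepSpeed ZeRO-3 on 4 local GPUs.
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3                    # shard params, gradients, and optimizer states
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true            # initialize large models directly in sharded form
mixed_precision: bf16              # T4s may require fp16 instead (no bf16 support)
num_machines: 1
num_processes: 4                   # one process per GPU
```

With ZeRO-3 the optimizer states, gradients, and parameters are sharded across the 4 GPUs, which is usually what you want instead of `device_map="auto"` (naive model parallelism) for fine-tuning.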

The problem is that I need to run it via Python, because I'm using Vertex AI Pipelines for MLOps. Is there a way to avoid calling `accelerate` on my script via bash?
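One option is to keep Python as the entry point and spawn `accelerate launch` as a subprocess from inside the pipeline step. A minimal sketch (the training script name `train_sft.py` is hypothetical; the config path is the one from the handbook repo mentioned above):

```python
import subprocess

def build_launch_cmd(script: str, config: str, *script_args: str) -> list:
    """Build the `accelerate launch` command line so it can be started
    from Python (e.g. a Vertex AI pipeline component) instead of bash."""
    return ["accelerate", "launch", "--config_file", config, script, *script_args]

cmd = build_launch_cmd(
    "train_sft.py",  # hypothetical training script
    "recipes/accelerate_configs/deepspeed_zero3.yaml",
)
# Hand the command to a subprocess; Python remains the entry point.
# subprocess.run(cmd, check=True)
```

Accelerate also provides `notebook_launcher`, which launches a training function across GPUs from within an already-running Python process, which may fit a pipeline component even better.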

Would using Optimum be a better solution if I want to run it via Python, without the CLI launch you mentioned?