Training using multiple GPUs

To train a model on multiple GPUs, refer to the Alignment Handbook, which uses DeepSpeed ZeRO-3 for multi-GPU training. Its training scripts are in the scripts directory of the huggingface/alignment-handbook repository on GitHub (main branch).

Multi-GPU training is handled by the Accelerate library, which the Trainer uses as its backend. Define an Accelerate configuration, as done in recipes/accelerate_configs/deepspeed_zero3.yaml in the huggingface/alignment-handbook repository on GitHub (main branch), and then pass that configuration file when launching the training script.
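
As a rough illustration of "passing the configuration when running the script", the sketch below assembles an accelerate launch command that points at a configuration file through the --config_file option. The helper build_launch_command and the script name train.py are hypothetical names chosen for this example, and the process count of 8 is an assumption. Substitute your own script and GPU count.

```python
import shlex


def build_launch_command(config_file, script, script_args=(), num_processes=None):
    """Assemble an `accelerate launch` command line as a list of arguments.

    Hypothetical helper for illustration; it only builds the command,
    it does not start training.
    """
    cmd = ["accelerate", "launch", "--config_file", config_file]
    if num_processes is not None:
        # Overrides the number of processes (typically one per GPU)
        # given in the configuration file.
        cmd += ["--num_processes", str(num_processes)]
    cmd.append(script)
    cmd.extend(script_args)
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        config_file="recipes/accelerate_configs/deepspeed_zero3.yaml",
        script="train.py",   # hypothetical training script that uses the Trainer
        num_processes=8,     # assumed GPU count
    )
    # Print the command so it can be inspected or pasted into a shell.
    # To launch from Python instead, pass `cmd` to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

The training script itself needs no DeepSpeed-specific code in this setup, because the sharding strategy comes from the configuration file. Switching to a different strategy is then mostly a matter of pointing --config_file at another file.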