Trainer API for Model Parallelism on Multiple GPUs

It depends on how you launch the script. If you use torch.distributed.launch (or have `accelerate config` set up for multi-GPU), it'll use DistributedDataParallel (DDP). To use model parallelism, just launch with `python myscript.py` and it should pick up model parallelism, as in the sketch below. (If you find it does not, or need some more assistance, let me know!)
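As a rough illustration, here is a minimal sketch of that naive model-parallel setup, assuming a causal-LM checkpoint and that `accelerate` is installed so `device_map="auto"` can spread the layers across the visible GPUs (the checkpoint name is only a placeholder):

```python
# train_mp.py -- launch with plain `python train_mp.py` for model parallelism,
# or via `torch.distributed.launch` / `accelerate launch` for DDP instead.
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",              # placeholder checkpoint; swap in your own model
    device_map="auto",   # lets accelerate shard the layers across all visible GPUs
)

args = TrainingArguments(output_dir="out", per_device_train_batch_size=1)
trainer = Trainer(model=model, args=args)  # pass train_dataset/eval_dataset as usual
```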

You can verify which mode you ended up in by checking trainer.args.parallel_mode: with a plain `python` launch it should print ParallelMode.NOT_DISTRIBUTED.
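For example, continuing the sketch above (`ParallelMode` lives in `transformers.training_args`):

```python
from transformers.training_args import ParallelMode

print(trainer.args.parallel_mode)
# Launched with plain `python`:                 ParallelMode.NOT_DISTRIBUTED
# Launched with torch.distributed.launch / DDP: ParallelMode.DISTRIBUTED
```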
