Trainer API for Model Parallelism on Multiple GPUs

Thanks for your reply! It is super helpful :slight_smile: It is great to know that just running `python {myscript.py}` makes the Trainer class use model parallelism.
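
For anyone finding this thread later, here is a minimal sketch of the setup I mean (the checkpoint, dataset, and output path are placeholders I picked, not from this thread): loading with `device_map="auto"` spreads the layers across the visible GPUs, and the Trainer then trains with that naive model parallelism under a plain `python myscript.py` launch.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

# device_map="auto" splits the model's layers across all visible GPUs,
# so a single `python myscript.py` run already uses naive model parallelism.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Tiny dummy dataset just to keep the sketch self-contained.
train_ds = Dataset.from_dict(
    tokenizer(["hello world"] * 8, truncation=True, padding="max_length", max_length=16)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1),
    train_dataset=train_ds,
    # mlm=False makes the collator copy input_ids into labels for causal LM training
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```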

A follow-up question from me: how does the Trainer's model parallelism differ from DeepSpeed and FSDP? Is there any documentation I can read to learn more about what is happening under the hood?
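
For context, my current understanding (which may be wrong, hence the question) is that DeepSpeed and FSDP are opted into through `TrainingArguments` plus a distributed launcher rather than through `device_map`, roughly like this (the config path is a placeholder):

```python
from transformers import TrainingArguments

# DeepSpeed: point at a DeepSpeed JSON config and launch with
# `deepspeed myscript.py` (or via an accelerate config).
ds_args = TrainingArguments(output_dir="out", deepspeed="ds_config.json")

# FSDP: pass fsdp options and launch with
# `torchrun --nproc_per_node=N myscript.py`.
fsdp_args = TrainingArguments(output_dir="out", fsdp="full_shard auto_wrap")
```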

Thanks a lot!