Thanks for the detailed explanation.
Under the hood, accelerate launch uses the DeepSpeed launcher. Same for PyTorch distributed.
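Roughly, a minimal sketch of the two launch paths side by side (the script name train.py and the process counts are placeholders, not from this thread):

    # DeepSpeed path: accelerate launch hands off to the DeepSpeed launcher
    accelerate launch --use_deepspeed --num_processes 2 train.py
    # ~ comparable to invoking the deepspeed CLI directly:
    deepspeed --num_gpus 2 train.py

    # PyTorch distributed path: accelerate launch hands off to torch.distributed
    accelerate launch --multi_gpu --num_processes 2 train.py
    # ~ comparable to invoking torchrun directly:
    torchrun --nproc_per_node 2 train.py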
Ah, that clears up a lot for me. Regarding ‘Same for PyTorch distributed.’: does that mean accelerate launch uses PyTorch distributed’s launcher, or that PyTorch distributed uses the DeepSpeed launcher?