Main code executed twice per process. Normal behaviour?

Hello everyone. I am just getting started with accelerate and distributed training in general. To test how many GPUs are used, I created a small script with a simple main function:

from accelerate import Accelerator, DeepSpeedPlugin

def main():
    deepspeed_plugin = DeepSpeedPlugin(zero_stage=3, gradient_accumulation_steps=1)
    accelerator = Accelerator(fp16=True, deepspeed_plugin=deepspeed_plugin)

    print(f'Num Processes: {accelerator.num_processes}; Device: {accelerator.device}; Process Index: {accelerator.process_index}')

When I launch this with CUDA_VISIBLE_DEVICES=0,1 accelerate launch --config_file accelerate_config.yaml, I get the following output:

Num Processes: 2; Device: cuda:0; Process Index: 0
Num Processes: 2; Device: cuda:0; Process Index: 0
Num Processes: 2; Device: cuda:1; Process Index: 1
Num Processes: 2; Device: cuda:1; Process Index: 1

It seems like main is executed twice per process. Is this expected behaviour?
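For reference, here is how I understood the launcher to behave, as a minimal sketch with plain multiprocessing (no accelerate, no GPUs; the launch helper and worker below are made-up stand-ins, not accelerate API):

```python
# Simulation of my mental model of `accelerate launch`: start one worker
# per process, each running main() exactly once, so with num_processes=2
# I expected exactly two printed lines in total.
import multiprocessing as mp

def main(process_index, queue):
    # Stand-in for the real main(): emit one line per process.
    queue.put(f"Process Index: {process_index}")

def launch(num_processes):
    # Rough stand-in for the launcher: one process per "device",
    # each calling main() a single time.
    queue = mp.Queue()
    procs = [mp.Process(target=main, args=(rank, queue))
             for rank in range(num_processes)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sorted(queue.get() for _ in range(num_processes))

if __name__ == "__main__":
    for line in launch(2):
        print(line)
```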

accelerate_config.yaml contains:

compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  offload_optimizer_device: cpu
  zero_stage: 3
distributed_type: DEEPSPEED
fp16: false
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
num_machines: 1
num_processes: 2

Thank you.

This is very weird; it's indeed not supposed to happen. You should only see two print statements.

Hmm yeah. Not really sure how to debug what is happening under the hood. I am using PyTorch 1.8.0 and the newest versions of both accelerate and deepspeed. Is this something you can reproduce on your end, or could it be related to my setup?

Ah, I rechecked and figured it out. I made a stupid formatting error and main was accidentally called twice in my script. Everything works as expected. Sorry for the inconvenience.
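For anyone who hits the same symptom: the duplicated output came from a second call site in my script, not from accelerate. A minimal sketch of the pitfall (plain Python, no accelerate needed):

```python
# Sketch of the bug: a stray second call to main() doubles every
# print in every launched process.
call_count = 0

def main():
    global call_count
    call_count += 1  # in the real script: build the Accelerator and print

main()
main()  # the accidental leftover call site
assert call_count == 2  # each process runs main() twice -> doubled output

# The fix is a single, guarded call site:
#
#     if __name__ == '__main__':
#         main()
```

The guard also keeps the script safe if a launcher or another module ever imports it, since the import alone then no longer runs main().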
