Accelerate runs out of RAM

Hi, I have an instance with 8x A100s and 1.1 TB of RAM. However, accelerate launch isn’t able to run my script on all 8 GPUs; it can only handle 6 processes.

accelerate launch --multi_gpu --mixed_precision=fp16 --num_processes=6 \
scripts/torch_convnext.py \
--model_name='convnext_large' --batch_size=64 --epochs=10 \
--lr=6e-5 --pretrained='imagenet' --optimize='AdamW' 

If I push num_processes above 6, the subprocesses die with a ‘Killed’ error, indicating all system RAM has been used up.
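
When this happens I can watch system RAM climb until the OOM killer steps in. This is the kind of check I’ve been running to confirm it (psutil here is just my own tooling, nothing accelerate-specific):

import psutil

# Quick sanity check that it's system RAM (not GPU memory) filling up.
mem = psutil.virtual_memory()
print(f"system RAM: {mem.used / 1e9:.0f} / {mem.total / 1e9:.0f} GB used")

# Resident memory of each python process the launcher spawned.
for p in psutil.process_iter(['name', 'memory_info']):
    if 'python' in (p.info['name'] or '') and p.info['memory_info']:
        print(p.pid, f"{p.info['memory_info'].rss / 1e9:.1f} GB")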

Is there any way I can utilize all 8 of my GPUs and prevent RAM from overflowing? :pray:

Code: accelerator = Accelerator(log_with='wandb')
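
In case the full context helps, a stripped-down version of my training loop follows the same pattern (the dummy dataset and model below are placeholders standing in for my real ImageNet-style data and convnext_large):

import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator(log_with='wandb')

# Placeholder data and model; the accelerate usage matches my real script.
dataset = TensorDataset(torch.randn(512, 3, 224, 224), torch.randint(0, 10, (512,)))
loader = DataLoader(dataset, batch_size=64, num_workers=4)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 10))
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-5)

model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(images), labels)
        accelerator.backward(loss)
        optimizer.step()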

I’m using CUDA 11.7 with PyTorch. Since the nightly isn’t out yet, I used the official NVIDIA Docker image.