`accelerate config` alternative for multi-node training

cyk1337 · September 5, 2022, 2:06pm

For multi-node training, the accelerate library requires manually running `accelerate config` on each machine. This is inconvenient when there are more than 10 nodes, since the same configuration has to be entered by hand 10+ times. Is there a way to automatically generate the config file on each machine?

muellerzr · September 5, 2022, 9:29pm

You can just use the same YAML file for every node: copy it onto each machine and tweak the node number in each copy.

cyk1337 · September 6, 2022, 2:04am

Hi @muellerzr, I was wondering whether there is such a tool. If not, I would develop one. Thank you very much for your answers.
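
A minimal sketch of what such a tool could look like, following the suggestion above: start from one YAML config and write one copy per node, changing only the node number. This is not an official accelerate utility. The template values (IP, port, process count), the helper names `render_node_config` and `write_node_configs`, and the output file names are all hypothetical placeholders; in practice the template would be the file produced by running `accelerate config` once.

```python
import re
from pathlib import Path

# Placeholder template: in practice, paste the YAML that `accelerate config`
# generated on one machine. All values below are illustrative assumptions.
TEMPLATE = """\
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
machine_rank: 0
main_process_ip: 10.0.0.1
main_process_port: 29500
num_machines: 2
num_processes: 16
"""


def render_node_config(template: str, rank: int, num_machines: int) -> str:
    """Return a copy of the template with the node-specific fields rewritten."""
    if not 0 <= rank < num_machines:
        raise ValueError(f"rank {rank} is outside 0..{num_machines - 1}")
    # Plain line-based substitution keeps the sketch free of a YAML dependency.
    out = re.sub(r"(?m)^machine_rank:.*$", f"machine_rank: {rank}", template)
    out = re.sub(r"(?m)^num_machines:.*$", f"num_machines: {num_machines}", out)
    return out


def write_node_configs(template: str, num_machines: int, out_dir: str) -> list[Path]:
    """Write node_<rank>.yaml for every node and return the paths in rank order."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for rank in range(num_machines):
        path = target / f"node_{rank}.yaml"
        path.write_text(render_node_config(template, rank, num_machines))
        paths.append(path)
    return paths
```

Calling `write_node_configs(TEMPLATE, 12, "configs")` would produce `configs/node_0.yaml` through `configs/node_11.yaml`. Each file can then be copied to its machine (for example with `scp`) and passed to `accelerate launch --config_file node_3.yaml train.py`, where `train.py` stands in for your own training script.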
