Hello,
I can successfully run Meta's 30B model on one node (following huggingface/accelerate issue #362 on GitHub, "load_checkpoint_and_dispatch 'Expected all tensors to be on the same device' for > 1 GPU devices"). Now I am curious whether I can run the same model across two nodes, to prepare for even larger models. I ran "accelerate config" and "accelerate launch my_script.py" on both nodes, but it seems that the complete model is loaded on each of the two nodes instead of being split between them.