Multinode FSDP not working

I have a setup where I need to load two 7B Llama models (one reward model and one SFT model). By default, when I use accelerate (whether launched via `accelerate config`/`accelerate launch` or via torchrun), it tries to load both models fully on the same node despite having multiple nodes available, and eventually crashes with an out-of-memory error.

I wanted to know whether it's possible to shard even a single model across multiple nodes using FSDP. If that's not possible, how do I go about loading multiple models in a multinode setting with FSDP so that it does not run out of memory? Any ideas would be appreciated.
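For reference, this is roughly the multinode FSDP config I've been trying with accelerate (a sketch only: the IPs, process counts, and layer class name are placeholders for my two-node setup, and some field names may differ between accelerate versions):

```yaml
# accelerate FSDP config sketch for a 2-node run (values are placeholders)
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
mixed_precision: bf16
num_machines: 2            # two nodes
num_processes: 16          # total GPUs across both nodes (8 per node here)
machine_rank: 0            # set to 1 on the second node
main_process_ip: 10.0.0.1  # placeholder: rank-0 node's address
main_process_port: 29500
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
```

Each node then runs `accelerate launch --config_file <this file> train.py` with its own `machine_rank`. With this config I'd expect FULL_SHARD to shard parameters across all 16 ranks, but the initial model loading still happens per node before sharding kicks in, which seems to be where the OOM occurs.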