Debugging accelerate processes on remote node(s)

I’m running into an issue where the subprocess on my second node times out after 900 seconds, and I’m trying to figure out how to debug the processes that pdsh launches on the remote node. I have found some information on attaching pdb to remote processes that are already running, but I’m wondering whether anyone working on accelerate has a preferred way to do it, so I don’t have to go through half a dozen examples dredged up from StackOverflow until I find one that works.
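
For context, one option that might avoid attaching a debugger at all is Python's standard-library `faulthandler` module, which can print every thread's stack on a signal or after a delay. This is only a minimal sketch, assuming the training script is Python running on a Unix node. The helper names `install_hang_diagnostics` and `clear_hang_diagnostics` and the 600-second default are hypothetical (chosen only to fire before the 900-second timeout), and nothing here is specific to accelerate.

```python
import faulthandler
import signal
import sys


def install_hang_diagnostics(timeout_s=600.0, stream=sys.stderr):
    """Hypothetical helper: arrange for stack dumps from a possibly hung process.

    `stream` must be a real file object (it needs a working fileno()).
    """
    # Dump the traceback of every thread when the process receives SIGUSR1.
    # SIGUSR1 exists on Unix only, so this sketch assumes Linux/macOS nodes.
    faulthandler.register(signal.SIGUSR1, file=stream, all_threads=True)
    # Additionally dump all tracebacks every `timeout_s` seconds until cancelled,
    # so a hang leaves a record even if nobody sends the signal.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def clear_hang_diagnostics():
    """Hypothetical helper: undo everything install_hang_diagnostics() set up."""
    faulthandler.cancel_dump_traceback_later()
    faulthandler.unregister(signal.SIGUSR1)
```

The idea would be to call `install_hang_diagnostics()` near the top of the training script and then, once a process appears stuck, run `kill -USR1 <pid>` on the remote node; the stack dump should show up wherever that process's stderr ends up.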

Seems to be related to this issue

It’s been a couple of weeks since I raised the issue and I haven’t heard anything back, so I wondered if anyone here might be able to help.