if image_list is not None:
    self.accelerator.backward(loss_ce + loss_diff)
    loss = loss_ce.detach() + loss_diff.detach()
else:
    # No images in this batch: only the CE loss is backpropagated, so this
    # rank's backward graph never touches the diffusion parameters.
    self.accelerator.backward(loss_ce)
    loss = loss_ce.detach()
I’m training on multiple GPUs, and with the code above some ranks take the `if` branch while others take the `else` branch in the same step. The backward graphs then differ across GPUs (the diffusion parameters get gradients on some ranks but not others), so the gradient all-reduce hangs and training deadlocks. Is there a way to fix this?
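One common fix, sketched below under assumptions (the `encoder`/`diff_head` modules and `training_step` function are hypothetical stand-ins for your model, not your actual code): always run the diffusion branch so every rank builds the same backward graph, and multiply `loss_diff` by zero on ranks that have no images. The zero-scaled branch still produces (zero) gradients for the diffusion parameters, so the all-reduce sees the same parameter set everywhere.

```python
import torch

# Hypothetical stand-ins for the CE path and the diffusion path.
encoder = torch.nn.Linear(4, 4)
diff_head = torch.nn.Linear(4, 4)

def training_step(x, image_list):
    h = encoder(x)
    loss_ce = h.pow(2).mean()
    # Always run the diffusion branch so the graph is identical on every
    # rank; zero out its contribution when this rank has no images.
    loss_diff = diff_head(h).pow(2).mean()
    if image_list is None:
        loss_diff = loss_diff * 0.0
    loss = loss_ce + loss_diff
    # In your code this would be self.accelerator.backward(loss).
    loss.backward()
    return loss.detach()

# Even with image_list=None, diff_head receives (zero) gradients,
# so no rank is missing parameters from the reduction.
training_step(torch.randn(2, 4), None)
print(diff_head.weight.grad is not None)          # gradients allocated
print(bool((diff_head.weight.grad == 0).all()))   # and all zero
```

An alternative, if the divergence is rare, is to construct the `Accelerator` with `DistributedDataParallelKwargs(find_unused_parameters=True)`, which lets DDP mark per-rank unused parameters as ready instead of waiting on them; note it adds per-step overhead, so the zero-scaled dummy loss is usually the cheaper option.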