Diffusers text-to-image finetuning example fails on multi-node

Hi @j-min! Would you mind opening an issue in the diffusers repo so the team can look into this?

Thanks a lot!