Saving a checkpoint is too slow with DeepSpeed

Hi, I have the same problem when saving codet5-6b with ZeRO-3. The logs say pytorch_model.bin was saved, but the file is not there and the process hangs. Did you find a solution?