Hi All,
I’m trying to run training in Google Colab, but I’m having trouble resuming from checkpoints.
I added this code to my training function:
    # Resume from a checkpoint
    if args.resume_from_checkpoint:
        if args.resume_from_checkpoint != "latest":
            path = os.path.basename(args.resume_from_checkpoint)
        else:
            # Get the most recent checkpoint
            dirs = os.listdir(args.output_dir)
            dirs = [d for d in dirs if d.startswith("checkpoint")]
            dirs = sorted(dirs, key=lambda x: int(x.split("-")[1]))
            path = dirs[-1]
        accelerator.print(f"Resuming from checkpoint {path}")
        accelerator.load_state(os.path.join(args.output_dir, path + "/unet"))
        global_step = int(path.split("-")[1])
but I get this error:
[Errno 2] No such file or directory: 'dreambooth-concept/checkpoint-2000/pytorch_model.bin'
Checking the checkpoint folder, I see that it contains no .bin file at the top level. In its unet subfolder, however, there is a file named diffusion_pytorch_model.bin.
Is this the file it is looking for?
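In case it helps anyone reproduce this, here is a minimal sketch for listing what a checkpoint folder actually contains. Note that `list_checkpoint_files` is a hypothetical helper name of my own, not part of accelerate or diffusers, and the path in the usage part is simply the one from the error message.

```python
import os


def list_checkpoint_files(checkpoint_dir):
    """Return every file under checkpoint_dir as a sorted list of relative paths."""
    found = []
    for root, _dirs, files in os.walk(checkpoint_dir):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, checkpoint_dir))
    return sorted(found)


if __name__ == "__main__":
    # Path taken from the error message; adjust to your own output_dir.
    ckpt = "dreambooth-concept/checkpoint-2000"
    if os.path.isdir(ckpt):
        for rel_path in list_checkpoint_files(ckpt):
            print(rel_path)
    else:
        print(f"{ckpt} does not exist")
```

Running this against the checkpoint folder shows which weight files were written and in which subfolder, which can then be compared with the path in the error.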
Thanks!