Model never saved

I followed the Train a diffusion model tutorial, and everything finished without errors, yet the final repo does not look like it should (compare anton-l/ddpm-butterflies-128/tree/main).
Mine looks like this: TukuToi/ddpm-butterflies-128/tree/main

As you can see, there is no unet folder, no model_index.json, etc.
I also checked the output folder in Colaboratory, and it does not contain the unet either.
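
For anyone who wants to run the same check in a notebook cell, here is a minimal sketch. It assumes the pipeline is saved with the usual diffusers layout (model_index.json at the top level, plus unet and scheduler subfolders with their config files). The helper name missing_pipeline_files and the folder name "ddpm-butterflies-128" are placeholders, not something from the tutorial.

```python
from pathlib import Path

# Config files a saved DDPMPipeline folder normally contains.
# The weights file name varies between versions, so it is not checked here.
EXPECTED = [
    "model_index.json",
    "unet/config.json",
    "scheduler/scheduler_config.json",
]


def missing_pipeline_files(output_dir):
    """Return the expected files that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: this is the value of config.output_dir in the notebook.
    print(missing_pipeline_files("ddpm-butterflies-128"))
```

An empty list means the folder looks like a complete pipeline. Any names that are printed were never written to disk.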

What is wrong? Where has the trained model gone?
The cell finished with the lines below, and the progress indicator shows it as done:

Adding files tracked by Git LFS: ['samples/0039.png', 'samples/0049.png']. This may take a bit of time if the files are large.
WARNING:huggingface_hub.repository:Adding files tracked by Git LFS: ['samples/0039.png', 'samples/0049.png']. This may take a bit of time if the files are large.
To https://huggingface.co/TukuToi/ddpm-butterflies-128
   2cd00c5..246df76  main -> main

WARNING:huggingface_hub.repository:To https://huggingface.co/TukuToi/ddpm-butterflies-128
   2cd00c5..246df76  main -> main

From the logs I gather it never even attempts to save it:

100%
1000/1000 [01:37<00:00, 10.20it/s]
Adding files tracked by Git LFS: ['samples/0000.png']. This may take a bit of time if the files are large.
Upload file samples/0000.png: 100%
526k/526k [00:01<00:00, 491kB/s]
Upload file logs/train_example/events.out.tfevents.1686132774.2b487d1c587c.748.1: 100%
7.71k/7.71k [00:01<?, ?B/s]
To https://huggingface.co/TukuToi/ddpm-butterflies-129
   722846e..b4db147  main -> main

Yet in the config I set save_model_epochs to 1 to be sure it saves the model every epoch.
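
One thing worth checking, offered as a guess rather than a confirmed diagnosis: if the training loop only calls pipeline.save_pretrained in the branch where push_to_hub is off, then with push_to_hub enabled the repo is pushed without the model ever being written to the folder. A minimal sketch of the save-then-push order follows. The names should_save, save_then_push and push_fn are hypothetical. The only real library call assumed is save_pretrained on the pipeline object.

```python
def should_save(epoch, save_model_epochs, num_epochs):
    """'Every N epochs, or on the last epoch' check (epoch is 0-based)."""
    return (epoch + 1) % save_model_epochs == 0 or epoch == num_epochs - 1


def save_then_push(pipeline, output_dir, push_to_hub, push_fn):
    """Write the pipeline to disk first, then optionally push.

    push_fn is a hypothetical zero-argument callable that uploads
    output_dir; pass whatever push call the notebook already uses.
    """
    # Saving unconditionally ensures the unet and model_index.json
    # exist in output_dir before anything is uploaded.
    pipeline.save_pretrained(output_dir)
    if push_to_hub:
        push_fn()
```

With save_model_epochs set to 1, should_save is true on every epoch, so the folder should contain the model after the first epoch if the save call is actually reached.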

This is really frustrating!

I'm getting the same problem and have been stuck on it for a week.

I tried this notebook, but it did not generate images as good as the samples in the notebook above.

Please share the solution if you have solved it.

Try this:

%pip install -qq -U diffusers datasets transformers accelerate ftfy "pyarrow<18.0.0a0"