this isn’t training from scratch, but from hacking around with the EveryDream2 stable diffusion fine-tuner i was able to get a fairly useful and reliable loss graph by holding the noise seed fixed when running a (fixed-set, fixed-sequence) validation pass.
the intuition is that because the diffusion process relies so heavily on noise, variance in that noise between validation passes tends to overwhelm the relatively small signal of decreasing loss. to correct for this, re-seed the noise RNG with the same seed every time you run a validation pass (i used the isolate_rng() context manager, iirc from pytorch lightning, so the train RNG doesn’t also get re-seeded). you’re still at the mercy of whatever sequence of noises that particular seed gives you for validation, but you should find the loss curve traces a more clearly decreasing trajectory (even if the decrease is small). a sketch of what i mean is below.
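here’s a minimal sketch of the kind of validation pass i mean, assuming a diffusers-style setup — the unet/vae/text_encoder/noise_scheduler names, the seed value, and the epsilon-prediction MSE loss are illustrative, not EveryDream2’s actual code, and the isolate_rng import path is the pytorch lightning one as i remember it:

```python
import torch
import torch.nn.functional as F
from pytorch_lightning.utilities.seed import isolate_rng  # saves/restores RNG state on exit

VAL_NOISE_SEED = 555  # arbitrary; what matters is that it never changes between passes

@torch.no_grad()
def validation_loss(unet, vae, text_encoder, noise_scheduler, val_batches, device):
    """Average validation loss with the noise RNG pinned to a fixed seed."""
    losses = []
    with isolate_rng():  # training RNG state is untouched outside this block
        torch.manual_seed(VAL_NOISE_SEED)  # same noise sequence on every validation pass
        for batch in val_batches:  # must be a fixed set, iterated in a fixed order
            latents = vae.encode(batch["pixel_values"].to(device)).latent_dist.sample()
            latents = latents * 0.18215  # SD1.x latent scaling factor
            noise = torch.randn_like(latents)  # deterministic given the seed above
            timesteps = torch.randint(
                0, noise_scheduler.config.num_train_timesteps,
                (latents.shape[0],), device=device,
            )
            noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
            encoder_hidden_states = text_encoder(batch["input_ids"].to(device))[0]
            pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
            losses.append(F.mse_loss(pred.float(), noise.float()).item())
    return sum(losses) / len(losses)
```

note that the timestep sampling and the VAE’s latent sampling also draw from the seeded RNG, so those get pinned too — which is what you want if the numbers are to be comparable across passes.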
fwiw, contrary to the link @sayakpaul provided, this loss curve is informative — it pretty reliably indicates when fine-tuning loss has reached a minimum, and it can be trusted to trend upward in a way that reflects the model overfitting the training data.
example: https://huggingface.co/damian0815/pashahlis-val-test-1e-6-ep30
i’m surprised no other stable diffusion fine-tuners have implemented this. also a bit suspicious…