The loss surface when training diffusion models is quite uninformative IMO. A good analysis of this is available here: [Efficient Diffusion Training via Min-SNR Weighting Strategy](https://arxiv.org/abs/2303.09556)
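For context, the idea in that paper is to clamp the per-timestep loss weight by the signal-to-noise ratio so that easy (high-SNR, low-noise) timesteps don't dominate training. A minimal sketch of the ε-prediction weight, `min(SNR(t), γ) / SNR(t)`, assuming a simple linear beta schedule (the schedule and the γ = 5 default here are illustrative, not from this thread):

```python
import numpy as np

def min_snr_weights(alphas_cumprod, timesteps, gamma=5.0):
    """Min-SNR loss weights for epsilon-prediction diffusion training.

    SNR(t) = alpha_bar_t / (1 - alpha_bar_t); clamping the SNR at `gamma`
    caps the weight of low-noise timesteps, which otherwise get a very
    large effective weight under the standard epsilon-prediction loss.
    """
    snr = alphas_cumprod[timesteps] / (1.0 - alphas_cumprod[timesteps])
    return np.minimum(snr, gamma) / snr

# Toy linear beta schedule, just to exercise the function.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)

# Early timesteps (high SNR) get heavily down-weighted; late, noisy
# timesteps (SNR below gamma) keep weight 1.0.
w = min_snr_weights(alphas_cumprod, np.array([0, 500, 999]))
```

The loss itself stays a plain MSE on the noise; only the per-sample weight changes, which is why the wall-clock cost is essentially zero.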
On the other hand, the smaller dataset obviously goes through more epochs in the same number of steps, so I guess it benefits more from seeing the same data again?
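To make the epoch arithmetic concrete (the step budget, batch size, and dataset sizes below are made-up numbers, not from this thread):

```python
def epochs_seen(steps, batch_size, dataset_size):
    # Number of full passes over the data within a fixed step budget.
    return steps * batch_size / dataset_size

# Same 10_000-step budget, batch size 32:
small = epochs_seen(10_000, 32, 5_000)   # small dataset: 64 epochs
large = epochs_seen(10_000, 32, 50_000)  # 10x larger dataset: 6.4 epochs
```

So at a fixed step count, the smaller dataset is repeated far more often, which is exactly the regime where overfitting concerns kick in.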
I’d think so, but have you checked whether the model overfits the data too quickly if you do that? This is something we have consistently observed in our experiments. Cc: @pcuenq @valhalla