Does ControlNet (and other diffusers) only include 1 noise injection per iteration in training loop?

Yes, you are understanding it correctly.
Why is it done like this? Here is my reasoning:

  1. Randomizing the noise level (the `timesteps` value) forces the model to handle every noise level. In theory, `timesteps` doesn’t even need to be an int; it can be a float. So if you tried to cram all noise levels into one batch, you would fail, since you can’t exhaustively sample all floating point values. In the current implementation it’s an int between 0 and 1000, but even with only 1000 levels, you can’t put 1000 samples into one batch, can you?
  2. Because you can’t put all noise levels into one batch, you have to pick some. Say your batch holds 24 samples: would you choose the first 24 timesteps, or randomize them? Uniform randomization is clearly better, because it yields a gradient estimate that is more representative of the whole noise range between 0 and 1000. It’s the same reason we shuffle training samples during training.
  3. Why include 24 different images (each with one noise level) in a training batch, instead of 24 noise levels of a single image? Because the first option gives a better gradient estimate over the whole dataset.

In short, it’s all about estimating the gradient better and weighing the trade-offs of what to put into the batch. Remember that training on every noise level is costly and you don’t have unlimited batch space. We want gradients that estimate the whole dataset, not a single image.
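To make the idea concrete, here is a minimal sketch of the single noise injection per sample: one uniformly random timestep per image in the batch, and one noising step via the closed-form forward process x_t = sqrt(ᾱ_t)·x₀ + sqrt(1 − ᾱ_t)·ε. The beta schedule values and scalar "images" are illustrative placeholders, not the actual diffusers implementation (which does the same thing with `torch.randint` and the scheduler's `add_noise`):

```python
import math
import random

NUM_TRAIN_TIMESTEPS = 1000  # int noise levels in [0, 1000), as in the post

def sample_batch_timesteps(batch_size, rng=random):
    # One independently sampled noise level per image in the batch --
    # the "uniform randomization" argued for above.
    return [rng.randrange(NUM_TRAIN_TIMESTEPS) for _ in range(batch_size)]

# Toy linear beta schedule -> cumulative alpha_bar (illustrative values).
betas = [1e-4 + (0.02 - 1e-4) * t / (NUM_TRAIN_TIMESTEPS - 1)
         for t in range(NUM_TRAIN_TIMESTEPS)]
alpha_bars = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bars.append(prod)

def add_noise(x0, t, eps):
    # Single noise injection: x_t = sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps
    a = alpha_bars[t]
    return math.sqrt(a) * x0 + math.sqrt(1.0 - a) * eps

# Pretend these scalars are 3 images; each gets its own timestep and noise.
batch = [0.5, -0.2, 0.8]
timesteps = sample_batch_timesteps(len(batch))
noisy = [add_noise(x, t, random.gauss(0.0, 1.0))
         for x, t in zip(batch, timesteps)]
```

At `t = 0` the sample is almost unchanged; near `t = 999`, `alpha_bars[t]` is tiny and the output is almost pure noise, so across many batches the random timesteps cover the full range of corruption levels.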
