Why is the target noise when calculating the loss?

saeu5407 · December 5, 2023, 5:54am

I’m studying the training process. It looks like model_pred is the result of denoising the latent with the UNet.

I don’t understand why the loss compares this result to the noise. Shouldn’t we compare it with the latent produced by the VAE encoder?

Also, what is the role of noise_scheduler.get_velocity in the elif branch?

# Get the target for loss depending on the prediction type
if noise_scheduler.config.prediction_type == "epsilon":
    target = noise
elif noise_scheduler.config.prediction_type == "v_prediction":
    target = noise_scheduler.get_velocity(latents, noise, timesteps)
else:
    raise ValueError(f"Unknown prediction type {noise_scheduler.config.prediction_type}")

# Predict the noise residual and compute loss
# If this is denoising, shouldn't the target be the VAE encoding of edit_image before noise is added?
model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
loss = F.mse_loss(model_pred.float(), target.float(), reduction="mean")
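
For readers following the snippet above, here is a minimal sketch of what the v_prediction target looks like, under stated assumptions. As far as I can tell, get_velocity computes sqrt(alpha_bar) * noise - sqrt(1 - alpha_bar) * latents, where alpha_bar stands in for the scheduler's cumulative alpha at the sampled timestep. The helper names and the value of alpha_bar below are hypothetical, and the code uses plain Python scalars instead of tensors so it runs without any dependency. It only shows that the clean latent and the noise can both be recovered from the noisy latent and the velocity.

```python
import math
import random


def add_noise(latent, noise, alpha_bar):
    """Forward diffusion for one scalar: mix the clean latent with noise."""
    return math.sqrt(alpha_bar) * latent + math.sqrt(1.0 - alpha_bar) * noise


def velocity(latent, noise, alpha_bar):
    """Assumed v-prediction target: sqrt(a) * noise - sqrt(1 - a) * latent."""
    return math.sqrt(alpha_bar) * noise - math.sqrt(1.0 - alpha_bar) * latent


def latent_from_velocity(noisy, v, alpha_bar):
    """Recover the clean latent from the noisy latent and the velocity."""
    return math.sqrt(alpha_bar) * noisy - math.sqrt(1.0 - alpha_bar) * v


def noise_from_velocity(noisy, v, alpha_bar):
    """Recover the added noise from the noisy latent and the velocity."""
    return math.sqrt(alpha_bar) * v + math.sqrt(1.0 - alpha_bar) * noisy


random.seed(0)
alpha_bar = 0.7                 # hypothetical cumulative alpha for some timestep
latent = 0.5                    # stands in for one VAE latent value
noise = random.gauss(0.0, 1.0)  # scalar analogue of torch.randn_like(latents)

noisy = add_noise(latent, noise, alpha_bar)
v = velocity(latent, noise, alpha_bar)

recovered_latent = latent_from_velocity(noisy, v, alpha_bar)
recovered_noise = noise_from_velocity(noisy, v, alpha_bar)
print(recovered_latent, recovered_noise)
```

In other words, the velocity is just another parameterization of the same information: a model that predicts it accurately also determines both the latent and the noise.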

mrmiller · February 8, 2024, 1:41am

That noise is based on the latent image.

sh-lee · April 18, 2024, 4:30am

Is the noise really based on the latent image?
A few lines above, noise is set to random noise:
noise = torch.randn_like(latents)

mrmiller · April 18, 2024, 5:10am

The random noise is added to the image in img2img.
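
To make this concrete, here is a minimal sketch, assuming the usual DDPM-style forward process: the noise is sampled independently of the latents, then mixed into them to form noisy_latents. Because the mix is a known linear combination, a correct noise prediction is enough to get the clean latents back, which is why comparing model_pred to noise is a valid training target. The helper names and the alpha_bar value are hypothetical, and plain Python lists stand in for tensors.

```python
import math
import random


def add_noise(latents, noise, alpha_bar):
    """Mix clean latents with independent noise (DDPM-style forward step)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * n for x, n in zip(latents, noise)]


def recover_latents(noisy, predicted_noise, alpha_bar):
    """Invert the mix: clean = (noisy - sqrt(1 - a) * noise) / sqrt(a)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(y - b * n) / a for y, n in zip(noisy, predicted_noise)]


def mse(pred, target):
    """Mean squared error, the scalar analogue of F.mse_loss(..., 'mean')."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


random.seed(0)
alpha_bar = 0.7                                # hypothetical value for one timestep
latents = [0.5, -1.2, 0.3, 2.0]                # stands in for the VAE-encoded image
noise = [random.gauss(0.0, 1.0) for _ in latents]  # like torch.randn_like(latents)

noisy_latents = add_noise(latents, noise, alpha_bar)

# Pretend the model predicted the noise perfectly.
model_pred = list(noise)
loss = mse(model_pred, noise)
recovered = recover_latents(noisy_latents, model_pred, alpha_bar)
print(loss, recovered)
```

With a perfect prediction the loss is zero and the recovered values match the original latents up to floating-point error, so predicting the noise and predicting the clean latent carry the same information.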
