Best way to add conditions to a diffusion model

I want to add a non-textual condition to the UNet and fine-tune Stable Diffusion. What is the best way to do so? One approach I am considering is to add the condition to the time embedding and pass it through the UNet that way. Is there a better way?
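
To make the idea concrete, here is a minimal sketch of what I mean by adding the condition to the time embedding. It is plain PyTorch, not the diffusers implementation. `ConditionedTimeEmbedding`, `cond_dim`, `embed_dim`, and `sin_dim` are names I made up, and I am assuming the condition is a fixed-size float vector per sample.

```python
import math

import torch
import torch.nn as nn


def sinusoidal_embedding(timesteps: torch.Tensor, dim: int) -> torch.Tensor:
    """Sinusoidal timestep features of shape (batch, dim); dim must be even."""
    half = dim // 2
    freqs = torch.exp(
        -math.log(10000.0)
        * torch.arange(half, dtype=torch.float32, device=timesteps.device)
        / half
    )
    args = timesteps.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)


class ConditionedTimeEmbedding(nn.Module):
    """Hypothetical module: time embedding plus a projected condition vector."""

    def __init__(self, cond_dim: int, embed_dim: int, sin_dim: int = 320):
        super().__init__()
        self.sin_dim = sin_dim
        self.time_mlp = nn.Sequential(
            nn.Linear(sin_dim, embed_dim), nn.SiLU(), nn.Linear(embed_dim, embed_dim)
        )
        self.cond_mlp = nn.Sequential(
            nn.Linear(cond_dim, embed_dim), nn.SiLU(), nn.Linear(embed_dim, embed_dim)
        )
        # Assumption: zero-initialise the last condition layer so that, at the
        # start of fine-tuning, the output equals the plain time embedding.
        nn.init.zeros_(self.cond_mlp[-1].weight)
        nn.init.zeros_(self.cond_mlp[-1].bias)

    def forward(self, timesteps: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        t_emb = self.time_mlp(sinusoidal_embedding(timesteps, self.sin_dim))
        # The condition is summed into the time embedding, so every block that
        # consumes the time embedding also sees the condition.
        return t_emb + self.cond_mlp(cond)


emb = ConditionedTimeEmbedding(cond_dim=4, embed_dim=32, sin_dim=16)
t = torch.tensor([0, 10, 999])
c = torch.randn(3, 4)
out = emb(t, c)  # shape (3, 32)
```

The zero initialisation is my own design choice: it means the condition has no effect until training updates the projection, which seems safer when starting from pretrained weights.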

While I’m at it, how can I modify the time embedding? In UNet2DConditionModel there doesn’t seem to be a straightforward way of doing so.
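
One idea I have for this, as an untested sketch: swap the model's time-embedding submodule for a wrapper that adds a projected condition to its output. The example below uses a toy stand-in (`ToyUNet`) rather than the real model. I am assuming `UNet2DConditionModel` exposes its timestep MLP as a replaceable attribute (I believe it is called `time_embedding`, but that should be checked against the installed diffusers version). `CondInjectingEmbedding` and `set_condition` are hypothetical names.

```python
import torch
import torch.nn as nn


class CondInjectingEmbedding(nn.Module):
    """Wraps an existing time-embedding module and adds a projected condition."""

    def __init__(self, base: nn.Module, cond_dim: int, embed_dim: int):
        super().__init__()
        self.base = base
        self.cond_proj = nn.Linear(cond_dim, embed_dim)
        # Zero-init so the wrapped model initially behaves like the original.
        nn.init.zeros_(self.cond_proj.weight)
        nn.init.zeros_(self.cond_proj.bias)
        self._cond = None

    def set_condition(self, cond: torch.Tensor) -> None:
        # Set before each forward pass; avoids changing the UNet's signature.
        self._cond = cond

    def forward(self, *args, **kwargs):
        emb = self.base(*args, **kwargs)
        if self._cond is not None:
            emb = emb + self.cond_proj(self._cond)
        return emb


class ToyUNet(nn.Module):
    """Stand-in for a real UNet: only the pieces needed to show the swap."""

    def __init__(self, embed_dim: int = 16):
        super().__init__()
        self.time_embedding = nn.Linear(8, embed_dim)
        self.out = nn.Linear(embed_dim, 4)

    def forward(self, t_feats: torch.Tensor) -> torch.Tensor:
        return self.out(self.time_embedding(t_feats))


unet = ToyUNet()
# Replace the submodule in place; the rest of the model is untouched.
unet.time_embedding = CondInjectingEmbedding(
    unet.time_embedding, cond_dim=3, embed_dim=16
)
unet.time_embedding.set_condition(torch.randn(2, 3))
pred = unet(torch.randn(2, 8))  # shape (2, 4)
```

Stashing the condition on the wrapper is a workaround for not being able to pass an extra argument through the model's forward call; whether this is cleaner than subclassing the model is exactly what I am unsure about.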