Decoding latents to RGB without upscaling

I fit them against some grayscale images, some bright saturated colors like the hot air balloon example above, and some mid-tones. The result seems close enough for the images I’ve seen come out of it. Color accuracy is worse on some images than others, but it’s sufficient if you’re just trying to get a rough idea of the composition.
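For anyone who wants to try this kind of fit, here is a minimal sketch of the general idea, not my actual fitting code. It assumes the approximation is a plain linear map from the 4 latent channels to 3 RGB channels, that target colors are scaled to [-1, 1], and it uses synthetic data in place of real latent/image pairs. The function names `fit_latent_to_rgb` and `latents_to_rgb` are made up for this example.

```python
import numpy as np

def fit_latent_to_rgb(latents, rgb):
    """Least-squares fit of a (4, 3) matrix mapping latent channels to RGB.

    latents: (N, 4) array, one row per latent pixel
    rgb:     (N, 3) array, matching target colors (assumed to be in [-1, 1])
    """
    coeffs, *_ = np.linalg.lstsq(latents, rgb, rcond=None)
    return coeffs  # shape (4, 3)

def latents_to_rgb(latent, coeffs):
    """Project a (4, H, W) latent to an (H, W, 3) uint8 preview, no upscaling."""
    rgb = np.einsum("chw,cr->hwr", latent, coeffs)
    rgb = np.clip((rgb + 1.0) / 2.0, 0.0, 1.0)  # assumed [-1, 1] -> [0, 1]
    return (rgb * 255).astype(np.uint8)

# Synthetic stand-in data: in practice the samples would be latent pixels
# paired with the colors of the downscaled decoded image.
rng = np.random.default_rng(0)
true_coeffs = rng.normal(size=(4, 3))   # hypothetical "ground truth" map
samples = rng.normal(size=(1000, 4))    # fake latent pixels
targets = samples @ true_coeffs         # fake RGB targets

coeffs = fit_latent_to_rgb(samples, targets)
preview = latents_to_rgb(rng.normal(size=(4, 8, 8)), coeffs)
```

The preview comes out at latent resolution (one pixel per latent pixel), which is why it is cheap: it is a single matrix multiply per step rather than a pass through the decoder.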

I’ve posted some demo code for both gradio and ipywidgets: keturn/sd-progress-demo on GitHub (cheap views of intermediate Stable Diffusion results).

I’m sure it would have gone better if I had any idea what I was doing. :laughing: But this is my first project involving PyTorch and neural networks. I have some more catching up to do on the fundamentals before I can even read how the decoder is put together, let alone understand how to modify it.
