Trade-offs when upscaling an image

xxmtg4 · December 20, 2023, 5:20am

Hey guys, I have been looking for a way to upscale an image and have implemented some code.
I want to share my results and findings.
Hopefully we can improve the design a bit and come up with some new ideas.

Test base image (512x512):

Test steps:

1. Upscale the image to 2048x2048 using R-ESRGANx4.
2. Split the image into 512x512 tiles. Each tile overlaps its neighbours by N pixels, where N is a parameter.
3. Run the diffusers img2img pipeline on each tile with the same prompt that generated the base image.
4. Assemble the re-painted tiles, doing a weighted addition between tiles:

   if region of tile is overlapping:
      do a weighted addition and put the sum on canvas
   else:
      put the region on canvas

5. Output the final image.
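
The split and assemble steps above can be sketched roughly as follows. This is a minimal NumPy sketch, not the exact code used for the results in this thread: the helper names (`tile_starts`, `ramp_weights`, `blend_tiles`), the clamped last tile, and the linear ramp weights are all assumptions. Instead of branching on "overlapping or not", it divides by the accumulated weights, which gives the same result as a plain copy wherever only one tile covers a pixel.

```python
import numpy as np

def tile_starts(length, tile, overlap):
    """Start offsets so that tiles of size `tile` cover `length` pixels."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # assumption: last tile is clamped to the border
    return starts

def ramp_weights(tile, overlap):
    """2-D mask: 1 in the interior, linear ramp over `overlap` pixels at each edge."""
    ramp = np.ones(tile, dtype=np.float32)
    if overlap > 0:
        # Drop the end points so every weight stays strictly positive.
        edge = np.linspace(0.0, 1.0, overlap + 2, dtype=np.float32)[1:-1]
        ramp[:overlap] = edge
        ramp[-overlap:] = edge[::-1]
    return np.outer(ramp, ramp)

def blend_tiles(tiles, positions, height, width, overlap):
    """Weighted average of square (tile, tile, C) float tiles on one canvas."""
    tile = tiles[0].shape[0]
    weights = ramp_weights(tile, overlap)[..., None]
    canvas = np.zeros((height, width, tiles[0].shape[2]), dtype=np.float32)
    weight_sum = np.zeros((height, width, 1), dtype=np.float32)
    for img, (y, x) in zip(tiles, positions):
        canvas[y:y + tile, x:x + tile] += img * weights
        weight_sum[y:y + tile, x:x + tile] += weights
    # Normalising handles overlapping and non-overlapping regions alike.
    return canvas / weight_sum

# Usage: split an image, (here) leave the tiles untouched, and re-assemble.
image = np.random.default_rng(0).random((200, 200, 3)).astype(np.float32)
starts = tile_starts(200, 64, 16)
positions = [(y, x) for y in starts for x in starts]
tiles = [image[y:y + 64, x:x + 64] for y, x in positions]
result = blend_tiles(tiles, positions, 200, 200, 16)
```

In the real pipeline each tile would go through img2img between the split and the blend. With untouched tiles, as here, the result should match the input up to float rounding, which is a handy sanity check for the blending code.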

Time performance:
Euler-A is consistently faster: it reaches the same quality as DDIM with fewer iterations. I think it can go as low as 25 iterations, but I use 35 for the best result.
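
To put rough numbers on the cost, the sketch below counts tiles and requested scheduler steps for the settings used in this thread (2048x2048 image, 512x512 tiles). The tile layout (fixed stride of tile minus overlap, with a final tile to cover the border) and the helper names are assumptions, and the step count is only an upper bound: as far as I know, img2img runs only part of the schedule, depending on the noise strength.

```python
import math

def tiles_per_axis(length, tile, overlap):
    """Tiles needed to cover `length` pixels when neighbours share `overlap` pixels."""
    if length <= tile:
        return 1
    stride = tile - overlap  # distance between consecutive tile origins
    return math.ceil((length - tile) / stride) + 1

def requested_steps(size, tile, overlap, iterations):
    """Upper bound on scheduler steps for a square image: tiles x iterations."""
    per_axis = tiles_per_axis(size, tile, overlap)
    return per_axis * per_axis * iterations

for name, iterations, overlap in [("DDIM", 45, 48), ("Euler-A", 35, 48), ("Euler-A", 35, 64)]:
    n_tiles = tiles_per_axis(2048, 512, overlap) ** 2
    print(name, overlap, n_tiles, requested_steps(2048, 512, overlap, iterations))
# DDIM 48 25 1125
# Euler-A 48 25 875
# Euler-A 64 25 875
```

Under this assumed layout, overlaps of 48 and 64 pixels both need 25 tiles (versus 16 with no overlap), so the difference between the runs comes from the iteration count rather than the overlap.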

Memory performance:
Both schedulers consume almost the same GPU memory (about 6 GB total in my case). Tiled diffusion is almost a must-have; otherwise my GPU always runs out of memory. So far I have not found any good approach that directly upscales an image from 512x512 to a larger size. If you know of one, please share it.

Findings and issues:

1. A good upscaler is a good beginning. This is actually very important, because there is a trade-off between noise strength and consistency between tiles. If a poor upscaler is used with a low noise strength, the final image becomes blurry, since img2img is “mimicking” the blurriness of its input. However, if a high noise strength is used, the image no longer looks like the original.
2. Overlap size and ghost images. If you look at the test results, specifically near the trees or other highly detailed regions, you will find ghost “regions”. These appear in the overlapping regions, where I am doing a weighted alpha sum of two tiles. The larger the overlap, the smoother the image subjectively looks, but the ghost regions grow as well. This is the issue I am trying to resolve, and I did not find any related discussions.
3. Are there any other methods we could implement to compare results? I know that web-ui has a “high res fix” implementation. However, after checking the code, it seems to be a tiled-diffusion-based method. The issue “does diffusers have the equivalent to hires fix from A1111? · Issue #3429 · huggingface/diffusers · GitHub” mentions a latent-space approach, but I did not find any further discussion of it.
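
The ghosting described in the overlap finding can be reproduced with a toy one-dimensional example. This is only an illustration under assumed values, not the code behind the posted results: two tiles draw the same hard edge at slightly different positions inside an 8-pixel overlap, and a linear alpha blend (the hypothetical `blend_overlap` helper) mixes them.

```python
import numpy as np

def blend_overlap(left, right):
    """Linear alpha blend across an overlap: all `left` at index 0, all `right` at the end."""
    alpha = np.linspace(0.0, 1.0, left.shape[0])
    return (1.0 - alpha) * left + alpha * right

overlap = 8
x = np.arange(overlap)
left = (x >= 2).astype(float)   # left tile draws the edge at x = 2
right = (x >= 6).astype(float)  # right tile draws the same edge at x = 6

blended = blend_overlap(left, right)
# Pixels 2..5, where the tiles disagree, end up strictly between 0 and 1:
# a faded copy of the left tile's edge instead of one crisp edge.
ghost = (blended > 0.0) & (blended < 1.0)
print(blended.round(3), ghost.sum())
```

The semi-transparent pixels only appear where the two tiles disagree, so any ghost is confined to the overlap. That is consistent with the observation that a larger overlap gives smoother seams but leaves more room for ghosts.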

xxmtg4 · December 20, 2023, 5:20am

Test results (2048x2048):

Tiled diffusion with DDIM, 45 iterations, overlapping 48 pixels.

[Image: 1703046241.256251-test, 1920×1920, 204 KB]

xxmtg4 · December 20, 2023, 5:21am

Tiled diffusion with Euler-A, 35 iterations, overlapping 48 pixels.

[Image: 1703046712.089393-test, 1920×1920, 209 KB]

xxmtg4 · December 20, 2023, 5:22am

Tiled diffusion with Euler-A, 35 iterations, overlapping 64 pixels.

[Image: 1703047114.0242655-test, 1920×1920, 206 KB]
