I’m trying to set the initial latents of `StableDiffusionInpaintPipeline` to the VAE encoding of the original image.
This is what I tried:
```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from torchvision.transforms.functional import pil_to_tensor

# load the pipeline
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# convert the image to a tensor and add a batch dimension
# (init_image and fg_mask are PIL images defined earlier)
image_tensor = pil_to_tensor(init_image).to(torch.float16).unsqueeze(0)

# encode the image into its latent space using the VAE
with torch.no_grad():
    latents = pipe.vae.encode(image_tensor.cuda())
tensor_image = latents.latent_dist.sample()

image = pipe(
    prompt="",
    image=init_image,
    mask_image=fg_mask,
    latents=tensor_image,
).images
```
However, the results look very wrong: the overall image is still recognizable, but it looks almost like latent noise, and the image also changes in the regions where the mask is 0.
What am I missing?
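For context, my current understanding (which may be wrong) is that the diffusers VAE expects pixel values scaled to `[-1, 1]`, while `pil_to_tensor` returns `uint8` values in `[0, 255]`, and that the sampled latents are supposed to be multiplied by `vae.config.scaling_factor` (0.18215 for Stable Diffusion). A minimal sketch of the normalization I think is missing, using NumPy arrays as a stand-in for the real tensors:

```python
import numpy as np

# vae.config.scaling_factor for Stable Diffusion checkpoints
SCALING_FACTOR = 0.18215

def normalize_pixels(arr: np.ndarray) -> np.ndarray:
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return arr.astype(np.float32) / 127.5 - 1.0

# toy "image" standing in for the real image tensor
pixels = np.array([0, 255], dtype=np.uint8)
normalized = normalize_pixels(pixels)  # -> [-1.0, 1.0]

# after vae.encode(...).latent_dist.sample(), the latents would then be
# scaled before being handed to the pipeline:
# tensor_image = tensor_image * SCALING_FACTOR
```

I’m also not sure whether the pipeline’s `latents` argument is meant for encoded images at all — the docs describe it as pre-generated noisy latents sampled from a Gaussian distribution, so maybe this approach can’t work regardless of the normalization.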