Hello,
I’m trying to initialize the latents of the StableDiffusionInpaintPipeline with the VAE encoding of the original image, instead of random noise.
This is what I tried:
import torch
from diffusers import StableDiffusionInpaintPipeline
from torchvision.transforms.functional import pil_to_tensor

# load the pipeline
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# convert the image to a float tensor in [-1, 1] and add a batch dimension
# (pil_to_tensor returns uint8 values in [0, 255], which the VAE does not expect)
image_tensor = pil_to_tensor(init_image).to(torch.float16).unsqueeze(0)
image_tensor = image_tensor / 127.5 - 1.0

# encode the image into latent space with the VAE and apply the scaling factor
with torch.no_grad():
    latents = pipe.vae.encode(image_tensor.cuda())
    tensor_image = latents.latent_dist.sample() * pipe.vae.config.scaling_factor

image = pipe(
    prompt="",
    image=init_image,
    mask_image=fg_mask,
    latents=tensor_image,
).images[0]
However, the results look very wrong: the overall image is recognizable, but the output looks almost like latent noise, and it also changes in regions where the mask is 0 (which should stay untouched).
What am I missing?
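
In case it helps isolate the problem, here is a round-trip sanity check on the encode step (a minimal sketch building on the variables above; the output filename is just illustrative). If decoding the latents reproduces init_image, the encoding itself is fine and the issue must be in how the pipeline consumes the latents argument:

from PIL import Image

# decode the latents back to pixel space (undo the scaling factor first)
with torch.no_grad():
    decoded = pipe.vae.decode(tensor_image / pipe.vae.config.scaling_factor).sample

# map from [-1, 1] back to [0, 255] and save for visual inspection
decoded = ((decoded / 2 + 0.5).clamp(0, 1) * 255).to(torch.uint8)
Image.fromarray(decoded[0].permute(1, 2, 0).cpu().numpy()).save("roundtrip.png")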