Img2img inpaint + ControlNet openpose

Hi there,

I am trying to create a workflow with these inputs:

  • prompt
  • image
  • mask_image
  • ControlNet openpose conditioning

The workflow should preserve the masked part of the input image and generate new content around the masked area so that it blends in.
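One detail that may matter here: in diffusers' inpaint pipelines, white mask pixels mark the region to repaint and black pixels the region to keep, so a mask that marks the part to *preserve* has to be inverted before being passed in. A minimal PIL sketch (the square region is just a stand-in for a real mask):

```python
from PIL import Image, ImageOps

# Stand-in mask: white (255) marks the region I want to KEEP.
mask = Image.new("L", (512, 512), 0)
mask.paste(255, (128, 128, 384, 384))

# diffusers inpaint pipelines repaint WHITE pixels and keep BLACK ones,
# so invert before passing as mask_image.
inverted = ImageOps.invert(mask)
```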

I tried StableDiffusionControlNetInpaintPipeline with lllyasviel/control_v11p_sd15_openpose:

```python
import torch
from diffusers import (
    AutoencoderKL,
    ControlNetModel,
    StableDiffusionControlNetInpaintPipeline,
)

vae = AutoencoderKL.from_pretrained(
    "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16  # match pipeline dtype
)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    # "runwayml/stable-diffusion-v1-5",
    "SG161222/Realistic_Vision_V5.1_noVAE",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
).to("cuda")
```

Then, at inference time:

```python
self.pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    CACHE_DIR, torch_dtype=torch.float16
).to("cuda")
self.pipe.scheduler = EulerDiscreteScheduler.from_config(
    self.pipe.scheduler.config
)
self.pipe.enable_xformers_memory_efficient_attention()

image = image.resize((512, 512))
mask_image = mask_image.resize((512, 512))

print(image)
print(mask_image)
out = self.pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    image=image,
    mask_image=mask_image,
    width=width,
    height=height,
    guidance_scale=guidance_scale,
    generator=torch.Generator().manual_seed(seed),
    num_inference_steps=num_inference_steps,
)
```

The image looks correct when printed:

```
<PIL.Image.Image image mode=RGB size=512x512 at 0x7FE43E194E50>
```

But the self.pipe() call raises:

```
TypeError: image must be passed and be one of PIL image, numpy array, torch tensor, list of PIL images, list of numpy arrays or list of torch tensors, but is <class 'NoneType'>
```