StableDiffusionInpaintPipeline Tensor Input Error

The documentation for StableDiffusionInpaintPipeline (Stable diffusion pipelines) states that tensors can be passed for the image and mask_image parameters. The docs indicate that mask_image needs to have shape (B, H, W, 1), but what is the expected shape for image?

I used (B, H, W, 1) for both image and mask_image parameters, and got this error:

----> 4 image = pipe(prompt=inpaint_prompts[0], image=inpaint_input, mask_image=inpaint_mask).images[0]
6 image

2 frames
/usr/local/lib/python3.7/dist-packages/torch/autograd/ in decorate_context(*args, **kwargs)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
28 return cast(F, decorate_context)

/usr/local/lib/python3.7/dist-packages/diffusers/pipelines/stable_diffusion/ in call(self, prompt, image, mask_image, height, width, num_inference_steps, guidance_scale, negative_prompt, num_images_per_prompt, eta, generator, latents, output_type, return_dict, callback, callback_steps, **kwargs)
362 # prepare mask and masked_image
--> 363 mask, masked_image = prepare_mask_and_masked_image(image, mask_image)
364 mask =, dtype=text_embeddings.dtype)
365 masked_image =, dtype=text_embeddings.dtype)

/usr/local/lib/python3.7/dist-packages/diffusers/pipelines/stable_diffusion/ in prepare_mask_and_masked_image(image, mask)
22 def prepare_mask_and_masked_image(image, mask):
---> 23 image = np.array(image.convert("RGB"))
24 image = image[None].transpose(0, 3, 1, 2)
25 image = torch.from_numpy(image).to(dtype=torch.float32) / 127.5 - 1.0

AttributeError: 'Tensor' object has no attribute 'convert'

Are tensors not actually supported as inputs?

Hi @jnick!

You are right, tensors are not supported yet, but that’s about to be fixed; see this pull request.
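
Until that lands, one possible workaround is to convert the data to PIL images before calling the pipeline, since the traceback above shows that prepare_mask_and_masked_image calls image.convert("RGB"). Below is a minimal sketch, not the library's own conversion. It assumes a single (unbatched) image of shape (H, W, 3) and a mask of shape (H, W, 1), both with float values in [0, 1]; the helper name to_pil is hypothetical, and the random arrays only stand in for your real data.

```python
import numpy as np
from PIL import Image


def to_pil(array_like, mode):
    """Convert an (H, W, C) array or tensor with values in [0, 1] to a PIL image.

    Hypothetical helper for illustration; not part of diffusers.
    """
    # torch tensors expose .detach(); move them to the CPU and convert to NumPy.
    if hasattr(array_like, "detach"):
        array_like = array_like.detach().cpu().numpy()
    arr = np.asarray(array_like, dtype=np.float32)
    # Scale [0, 1] floats to 8-bit pixel values.
    arr = (np.clip(arr, 0.0, 1.0) * 255.0).round().astype(np.uint8)
    # PIL expects (H, W) for single-channel data, so drop a trailing axis of size 1.
    if arr.ndim == 3 and arr.shape[-1] == 1:
        arr = arr[..., 0]
    return Image.fromarray(arr).convert(mode)


# Stand-in data; replace with one item of your own batch, e.g. inpaint_input[0].
image_hw3 = np.random.rand(64, 64, 3)
mask_hw1 = np.zeros((64, 64, 1))
mask_hw1[16:48, 16:48] = 1.0  # assumed convention: white marks the region to repaint

init_image = to_pil(image_hw3, "RGB")
mask_image = to_pil(mask_hw1, "L")

# The pipeline call from the question would then become:
# image = pipe(prompt=inpaint_prompts[0], image=init_image, mask_image=mask_image).images[0]
```

If your tensors use a different layout or value range (for example channels first, or values in [-1, 1]), the permutation and scaling would need to be adjusted accordingly.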

That’s great! Thank you @pcuenq!