I’m encountering a persistent issue when running the StableDiffusionInpaintPipeline for an inpainting task. Despite passing inputs in the expected formats (both the image and mask are in PIL.Image.Image format with correct sizes), I keep receiving the following error:
ValueError: Input is in incorrect format. Currently, we only support <class 'PIL.Image.Image'>, <class 'numpy.ndarray'>, <class 'torch.Tensor'>
Here’s the code that triggers the error:
# Image and mask setup
image_pil = Image.fromarray(image_np)
mask_pil = Image.fromarray(black_mask).convert("L")
# Generator for reproducibility
generator = torch.Generator(device="cuda").manual_seed(0)
image = model["pipeline"](
    prompt=prompt,
    negative_prompt=IMG_INPAINTING_NEG_PROMPT,
    image=image_pil,  # PIL Image
    mask=mask_pil,  # Grayscale mask (mode "L")
    guidance_scale=8.0,
    num_inference_steps=50,
    generator=generator,
).images[0]
Image and Mask Details:
Image size: (512, 768), mode: RGB
Mask size: (512, 768), mode: L
The mask is binary (contains only 0 and 255 values).
I’ve also tried using a simple manually created mask to ensure that FastSAM-generated masks aren’t causing the issue, but I still get the same error.
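For anyone who wants to reproduce the manual-mask test, here is a minimal sketch of how such a mask could be built. The helper name make_test_mask and the rectangle coordinates are hypothetical placeholders, not the values used in the original run. Only the size (512, 768), mode "L" and the 0/255 values come from the details above.

```python
import numpy as np
from PIL import Image


def make_test_mask(width=512, height=768, box=(128, 192, 384, 576)):
    """Return a binary "L"-mode mask: 255 inside `box`, 0 elsewhere.

    `box` is (left, top, right, bottom) in pixels; the default is an
    arbitrary placeholder rectangle chosen for illustration only.
    """
    left, top, right, bottom = box
    mask = np.zeros((height, width), dtype=np.uint8)  # NumPy order is (rows, cols)
    mask[top:bottom, left:right] = 255
    return Image.fromarray(mask)  # a 2-D uint8 array becomes a mode "L" image


mask_pil = make_test_mask()
print(mask_pil.size, mask_pil.mode)
print(np.unique(np.asarray(mask_pil)))
```

Note that PIL reports size as (width, height), while the NumPy array is (height, width), so a (512, 768) image corresponds to an array of shape (768, 512).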
It looks like you are stuck here, but I think this is a bug in Diffusers…?
def is_valid_image(image):
    return isinstance(image, PIL.Image.Image) or isinstance(image, (np.ndarray, torch.Tensor)) and image.ndim in (2, 3)
Maybe this is what was intended:
def is_valid_image(image):
    return isinstance(image, PIL.Image.Image) or (isinstance(image, (np.ndarray, torch.Tensor)) and image.ndim in (2, 3))
ndim is not an attribute of PIL.Image.Image.
For now, it should be possible to get past this check by passing the inputs as NumPy arrays.
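A rough sketch of that NumPy conversion, in case it helps. The is_valid_image function here is a simplified local copy of the check quoted above, with the torch.Tensor branch left out so the snippet only needs NumPy and Pillow. The placeholder images stand in for the real image_pil and mask_pil. Scaling to floats in [0, 1] is an assumption based on how the pipeline documentation describes array inputs, and whether this actually avoids the error in the pipeline is not confirmed here.

```python
import numpy as np
from PIL import Image


def is_valid_image(image):
    # Simplified local copy of the quoted check (torch.Tensor branch omitted).
    return isinstance(image, Image.Image) or (
        isinstance(image, np.ndarray) and image.ndim in (2, 3)
    )


# Placeholder inputs with the sizes from the question; PIL size is (width, height).
image_pil = Image.new("RGB", (512, 768))
mask_pil = Image.new("L", (512, 768))

# Convert to NumPy arrays; scaling to [0, 1] floats is an assumption (see above).
image_arr = np.asarray(image_pil).astype(np.float32) / 255.0  # shape (768, 512, 3)
mask_arr = np.asarray(mask_pil).astype(np.float32) / 255.0    # shape (768, 512)

print(image_arr.shape, mask_arr.shape)
print(is_valid_image(image_arr), is_valid_image(mask_arr))
```

The arrays would then be passed to the pipeline call in place of the PIL objects.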
@sayakpaul I found a crappy bug in Diffusers.