Adding an object to a photo using StableDiffusionPipeline

allausk808317 · February 10, 2025, 2:17am

I was playing with the pipeline, trying to create a photo of my dog wearing a gold chain.

I created a pipeline:
from diffusers import StableDiffusionImg2ImgPipeline
pipe = StableDiffusionImg2ImgPipeline.from_pretrained("stabilityai/stable-diffusion-2-1").to(device)

Loaded a photo from my local machine:
from PIL import Image
init_image = Image.open("/Users/alex/Downloads/Meelo1.png").convert("RGB")

Wrote a prompt and then passed the prompt and the image to the pipeline:
prompt = "A dog face wearing a thick gold chain"

image = pipe(prompt=prompt, image=resized_image, strength=0.3, num_inference_steps=50).images[0]

When I displayed the generated image, it was just the original photo with some noise. Nothing resembling a gold chain was generated. I also tried adding a baseball cap or hat, but got the same result. I tried adjusting the prompt as well.

My question is: is this the correct and sufficient way to achieve what I want?

Thanks,

Alex

John6666 · February 10, 2025, 6:15am

With Image-to-Image, the composition tends to be preserved, but everything else tends to be redrawn. If you increase the strength, the necklace may be drawn, but there is a good chance that a different dog will appear.

For your intended use, Inpainting is probably the best approach. ControlNet allows you to do more advanced things, but it is harder to use.
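
For reference, here is a rough sketch of what the Inpainting route could look like. It is not tested against your photo. The checkpoint name, the prompt, and the `neck_box` helper (which just guesses that the neck is in the lower middle of the frame) are my assumptions; in practice you would paint the mask by hand over the area where the chain should go.

```python
def neck_box(width, height):
    """Hypothetical helper: a rectangle over the lower-middle of the frame,
    roughly where a dog's neck might sit in a portrait-style photo."""
    left, right = width // 4, (3 * width) // 4
    top, bottom = (height * 6) // 10, (height * 9) // 10
    return (left, top, right, bottom)


def add_gold_chain(photo_path, device="cpu", size=512):
    # Heavy imports stay inside the function so neck_box can be used alone.
    from PIL import Image, ImageDraw
    from diffusers import StableDiffusionInpaintPipeline

    # Assumption: this inpainting checkpoint; other inpainting models may suit better.
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-inpainting"
    ).to(device)

    init_image = Image.open(photo_path).convert("RGB").resize((size, size))

    # White pixels are repainted; black pixels are kept from the original.
    mask = Image.new("L", init_image.size, 0)
    ImageDraw.Draw(mask).rectangle(neck_box(size, size), fill=255)

    result = pipe(
        prompt="a thick gold chain around a dog's neck",
        image=init_image,
        mask_image=mask,
        num_inference_steps=28,
        guidance_scale=7.5,
    )
    return result.images[0]
```

Calling `add_gold_chain("/Users/alex/Downloads/Meelo1.png")` would return a PIL image in which only the masked rectangle has been redrawn, so the rest of the dog should stay as it was.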

allausk808317 · February 10, 2025, 11:39pm

Thanks, I did multiple runs and confirmed what you said. I am going to try Inpainting to see the results.
I also found that if I use a large number of inference steps (say 1000, with strength = 0.3), it tends to produce a warning like this:
image_process.py line 147 RuntimeWarning: invalid value encountered in cast
images = (images * 255).round().astype("uint8")
and returns a black image. Do you know what's causing this warning and the black image? Thanks!

John6666 · February 11, 2025, 5:59am

A black image is often returned when using an old GeForce (10x0 generation) or when the output is caught by safety_checker. I have also heard that it can happen when there is simply not enough RAM…
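
As for the warning itself: NumPy prints "invalid value encountered in cast" when a float array containing NaN or infinity is converted to an integer type, so the decoded image most likely contained non-finite values before the uint8 cast. A small sketch for checking that is below. `count_nonfinite` is a name I made up, and the commented usage assumes the `pipe`, `prompt` and `resized_image` from the first post.

```python
import math


def count_nonfinite(values):
    """Count NaN and +/-inf entries in a flat iterable of floats."""
    return sum(1 for v in values if not math.isfinite(v))


# Hypothetical usage: ask the pipeline for a float array instead of a PIL
# image, so the uint8 cast is skipped and the raw values can be inspected.
# arr = pipe(prompt=prompt, image=resized_image, strength=0.3,
#            output_type="np").images[0]
# print(count_nonfinite(arr.ravel().tolist()))
```

If the count is above zero, the problem is in the generation itself rather than in how the image is displayed. On a real array, `numpy.isfinite(arr).all()` does the same check much faster.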

Also, 20 to 100 steps is usually sufficient, and around 28 is usually fine. Even with fairly complex images, there is no benefit to making the step count too large. It is more likely to make the result look strange.
The same goes for guidance_scale, which should be between 3.5 and 7.5. Raising these parameters does not simply improve image quality.
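
One more thing about steps with Image-to-Image: as far as I remember from the diffusers source, the pipeline only runs the last part of the schedule, roughly strength × num_inference_steps steps. The function below is my own sketch of that arithmetic, not a diffusers API, and it assumes the trimming works the way I describe.

```python
def effective_img2img_steps(num_inference_steps, strength):
    """Approximate number of denoising steps img2img actually runs.

    Assumption: the schedule is trimmed so that only the final
    int(num_inference_steps * strength) steps are executed.
    """
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start


for steps, strength in [(50, 0.3), (28, 0.75), (1000, 0.3)]:
    print(steps, strength, effective_img2img_steps(steps, strength))
```

Under that assumption, 50 steps at strength 0.3 only denoises for 15 steps starting from a lightly noised copy of the photo, which fits the "original photo with some noise" result.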

allausk808317 · February 11, 2025, 1:46pm

Thanks a lot for the insights. They are very helpful.
