Entanglement issue of Dreambooth images

Hi, thanks for the blog post (Training Stable Diffusion with Dreambooth using Diffusers).
I have a question and would appreciate your help. I used DreamBooth in much the same way as you describe here. After training, the generated images look fine; as I understand it, this means the model reconstructs the training images well. But once I add context to the prompt (as in Figure 10 in the appendix of the DreamBooth paper: TOKEN on the beach, in the snow, etc.), the appearance of the generated subject changes, including its color and pattern, similar to the failure case described in Figure 9(b). Have you run into this, and could you offer some suggestions? My task is to generate images that match the initial images and place the subject in different backgrounds/contexts with minimal change to the subject itself.

I look forward to hearing from you. Please help with this question; I have tried many times. Thank you!

Could you provide information on the following?

  • The instance identifier you used (like sks was used in our blog)
  • The prompts you have tried so far.

@sayakpaul
Thank you so much.
Images: I tried different images, for instance: (1) pictures downloaded from TOMS for this item (Women's Multi Color Alpargata Global Jaquard Rope Espadrille Slip On | TOMS); (2) pictures of LV leather bags.
Code: I used ShivamShrirao’s optimized DreamBooth version so that training fits on a single 16GB GPU.
(3) The instance identifiers I used are uncommon tokens, as recommended, such as xlwsh; the class identifier is “TOMShoes”, “LV bags”, or “Louis Vuitton bags”.
(4) The prompts I used are, for instance:
a: prompt = "photo of xlph toms shoe on top of green grass with sunflowers surround it" #@param {type:"string"}
negative_prompt = " not same pattern, not same content" #@param {type:"string"}
b: prompt = "photo of wsxlph leatherbag in front of an old house " #@param {type:"string"}
negative_prompt = "not same color, not same pattern, not same content" #@param {type:"string"}
The reconstructed images are fine and similar to the initial images (I disabled the prior-preservation loss during training to try to reduce variation). However, when I generate images with the prompts above, the shoes may pick up sunflowers because the prompt describes the scene with autumn leaves or sunflowers, or the pattern on the shoes may change.
I also tried a 7UP soft drink; similarly, the design on the can may change.
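To make clear what disabling the preservation loss (the prior-preservation term in the DreamBooth paper) changes, here is a rough sketch of how I understand the training objective is combined. The function names and the plain-list inputs are hypothetical and only for illustration; the real training script works on tensors of predicted and target noise.

```python
def mse(pred, target):
    """Mean squared error between two equally sized lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    prior_pred=None, prior_target=None,
                    prior_loss_weight=1.0):
    """Instance loss plus an optional weighted prior-preservation term.

    Hypothetical helper: when no prior (class) batch is given, only the
    instance reconstruction term remains, which corresponds to training
    with prior preservation disabled.
    """
    loss = mse(instance_pred, instance_target)
    if prior_pred is not None and prior_target is not None:
        loss += prior_loss_weight * mse(prior_pred, prior_target)
    return loss


# Prior preservation disabled: instance term only.
loss_without_prior = dreambooth_loss([1.0, 2.0], [1.0, 4.0])

# Prior preservation enabled with weight 0.5.
loss_with_prior = dreambooth_loss([1.0, 2.0], [1.0, 4.0],
                                  prior_pred=[0.0, 0.0],
                                  prior_target=[1.0, 1.0],
                                  prior_loss_weight=0.5)
```

With the prior term removed, nothing in the objective keeps the class as a whole close to what the base model already generates, so the model is free to fit the few instance images very closely.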

This is described in the dreambooth paper (Figure 9. Failure modes.)

Are there any better suggestions? I have been reading this blog post recently (https://metaphysic.ai/entanglement-in-image-synthesis/). Thanks a lot in advance.

Could you share some visual results with the respective prompts (and other relevant parameters) you tried? This will help us in better gauging what we can suggest.

Initial input: Women’s Multi Color Alpargata Global Jaquard Rope Espadrille Slip On | TOMS
But the generated images looked like this:
Screenshot 2023-04-07 at 6.37.34 PM

The prompt and settings were:
prompt = "photo of xlph toms shoe on top of green grass with sunflowers surround it" #@param {type:"string"}
negative_prompt = " not same pattern, not same content" #@param {type:"string"}
num_samples = 4 #@param {type:"number"}
guidance_scale = 7.5 #@param {type:"number"}
num_inference_steps = 60 #@param {type:"number"}
height = 512 #@param {type:"number"}
width = 512 #@param {type:"number"}
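For anyone who wants to reproduce this outside the notebook, here is a minimal sketch of how these settings would typically be passed to a diffusers StableDiffusionPipeline. The helper names and the model_dir argument are hypothetical (model_dir stands for wherever the fine-tuned weights were saved), and the sketch assumes a CUDA GPU; the notebook itself may wire things up differently.

```python
def build_generation_kwargs(prompt, negative_prompt, num_samples=4,
                            guidance_scale=7.5, num_inference_steps=60,
                            height=512, width=512):
    """Map the notebook form fields onto pipeline call arguments."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        # num_samples in the notebook corresponds to images per prompt.
        "num_images_per_prompt": num_samples,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
        "height": height,
        "width": width,
    }


def generate(model_dir, **kwargs):
    """Load the fine-tuned weights from model_dir and return PIL images.

    Imports are local so the helper above can be used without a GPU
    or the diffusers package installed.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_dir, torch_dtype=torch.float16
    ).to("cuda")
    return pipe(**kwargs).images


kwargs = build_generation_kwargs(
    prompt="photo of xlph toms shoe on top of green grass with sunflowers surround it",
    negative_prompt=" not same pattern, not same content",
)
# images = generate("path/to/dreambooth-output", **kwargs)  # hypothetical path
```

Keeping the arguments in one dictionary makes it easy to rerun the same prompt while changing a single setting, such as guidance_scale, and compare the results side by side.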