[DreamBooth] HELP! DreamBooth multi-subject training makes result quality worse?

Overview

Hi, I fine-tuned Stable Diffusion v1.5 with the DreamBooth method. It worked fine when I trained a single subject (car or dog), but the result quality dropped when I trained both subjects at the same time with the script train_multi_subject_dreambooth.py.

Details

Dataset
Car
l52ZopPQw6PDncKglZ2TrpGcmJ7A
l52ZopPQw6SX0ZOgl52TrpGcmJ7A

Dog
p1-512
p2-512
p3-512

1. Single-subject experiment

--Dog
export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export INSTANCE_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/dog3"
export CLASS_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-dog"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-RPQS-dog-v5"
accelerate launch --mixed_precision="fp16" train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of RPQS dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=400 \
  --checkpointing_steps=600 \
  --max_train_steps=600

--Car
export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export INSTANCE_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/car2"
export CLASS_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-car"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-TKIB-car-v1"
accelerate launch --mixed_precision="fp16" train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of TKIB car" \
  --class_prompt="a photo of car" \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=400 \
  --checkpointing_steps=600 \
  --max_train_steps=600

Results - fine
A photo of RPQS dog
image
A photo of red TKIB car
image

2. Multi-subject experiment

export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-multi-v16"
# Subject 1
export INSTANCE_DIR_1="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/dog4"
export INSTANCE_PROMPT_1="a photo of RPQS dog"
export CLASS_DIR_1="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-dog/"
export CLASS_PROMPT_1="a photo of dog"
# Subject 2
export INSTANCE_DIR_2="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/car2"
export INSTANCE_PROMPT_2="A photo of TKIB car"
export CLASS_DIR_2="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-car"
export CLASS_PROMPT_2="A photo of car"

accelerate launch train_multi_subject_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir="$INSTANCE_DIR_1,$INSTANCE_DIR_2" \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="$INSTANCE_PROMPT_1,$INSTANCE_PROMPT_2" \
  --with_prior_preservation \
  --prior_loss_weight=1.0 \
  --class_data_dir="$CLASS_DIR_1,$CLASS_DIR_2" \
  --class_prompt="$CLASS_PROMPT_1,$CLASS_PROMPT_2" \
  --num_class_images=400 \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --checkpointing_steps=800 \
  --max_train_steps=800
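
The command above passes each per-subject setting as a comma-separated list, so the four lists need the same number of entries. A minimal sanity check could look like the sketch below. The helper count_entries and the shortened placeholder paths are illustrative assumptions, not part of the training script.

```shell
# Hypothetical helper: count the comma-separated entries in a string
# by counting the commas and adding one.
count_entries() {
  commas=$(printf '%s' "$1" | tr -cd ',')
  echo $(( ${#commas} + 1 ))
}

# Placeholder values standing in for the real paths and prompts.
INSTANCE_DIRS="instance/dog4,instance/car2"
INSTANCE_PROMPTS="a photo of RPQS dog,A photo of TKIB car"
CLASS_DIRS="class/photo-of-dog,class/photo-of-car"
CLASS_PROMPTS="a photo of dog,A photo of car"

# Every list must have as many entries as there are instance directories.
n=$(count_entries "$INSTANCE_DIRS")
lists_ok=1
for list in "$INSTANCE_PROMPTS" "$CLASS_DIRS" "$CLASS_PROMPTS"; do
  if [ "$(count_entries "$list")" -ne "$n" ]; then
    echo "mismatched list: $list" >&2
    lists_ok=0
  fi
done
echo "subjects=$n lists_ok=$lists_ok"
```

Note that this check counts every comma as a separator, so a prompt that itself contains a comma would be reported as two entries.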

Results - bad
A photo of RPQS dog
image
A photo of red TKIB car
image
A TKIB car with a RPQS dog
image

The results of the multi-subject experiment are lower in quality and look like fake images.

Questions

1. In principle, does DreamBooth support multi-subject training?

2. Are there any successful cases of multi-subject training with train_multi_subject_dreambooth.py? What should I pay attention to during multi-subject training?

3. In my experiment, what might be the reason for the bad results? Is it because SR was not fine-tuned?
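
One difference between the two setups that may be worth ruling out is the amount of training each run performs. The sketch below only does the arithmetic on the flags shown above. It assumes that --max_train_steps counts optimizer updates (as in the diffusers DreamBooth example script), that a single process is used, and that --train_batch_size means the same thing in both scripts. The helper samples_seen is hypothetical.

```shell
# Batches drawn per run = optimizer steps x batch size x accumulation steps
# (single process assumed; values copied from the commands in this post).
samples_seen() {
  echo $(( $1 * $2 * $3 ))
}

# Single-subject runs: --max_train_steps=600, batch 1, accumulation 2.
single=$(samples_seen 600 1 2)
# Multi-subject run: --max_train_steps=800, batch 1, no accumulation flag
# (assumed to default to 1).
multi=$(samples_seen 800 1 1)

echo "single-subject run: $single batches at learning rate 2e-6"
echo "multi-subject run:  $multi batches at learning rate 1e-6"
```

Under these assumptions, each single-subject run draws 1200 batches for its one subject, while the multi-subject run draws 800 batches at half the learning rate, so neither subject can be seen more often than in its single-subject run. This is only one possible factor, not a confirmed cause of the quality drop.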

Hey @lluu, that script isn’t officially supported because we haven’t done much work on multi-subject DreamBooth, so bad quality is to be expected.

Hi @williamberman. So DreamBooth may not support multi-subject training in principle? Is it because no training image shows both subjects at the same time?

It’s just not a well-explored topic. Unfortunately, I don’t have any insight into why 🙂

Okay. Thanks a lot! 🙂

Hello @lluu,

according to this in-depth analysis, DreamBooth works best when the text encoder is trained as well. This is even more important when training on multiple subjects, as it helps the model differentiate between your subjects and their associated tokens.

Try it with the --train_text_encoder flag and see if it helps.
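
As a rough sketch of that suggestion, the multi-subject command from the original post could be rerun with the flag added, assuming train_multi_subject_dreambooth.py accepts --train_text_encoder as this reply implies. The dry_run wrapper, the shortened placeholder paths, and the combined variable names (INSTANCE_DIRS and so on) are illustrative assumptions. Compared with the original command, the only change is the --train_text_encoder line.

```shell
# Hypothetical dry-run wrapper: prints the command instead of running it.
# Call the command without "dry_run" to launch training for real.
dry_run() { printf '%s ' "$@"; printf '\n'; }

# Placeholder values; substitute the real paths and prompts from the post above.
MODEL_NAME="${MODEL_NAME:-path/to/stable-diffusion-v1-5}"
OUTPUT_DIR="${OUTPUT_DIR:-path/to/output}"
INSTANCE_DIRS="${INSTANCE_DIRS:-instance/dog4,instance/car2}"
INSTANCE_PROMPTS="${INSTANCE_PROMPTS:-a photo of RPQS dog,A photo of TKIB car}"
CLASS_DIRS="${CLASS_DIRS:-class/photo-of-dog,class/photo-of-car}"
CLASS_PROMPTS="${CLASS_PROMPTS:-a photo of dog,A photo of car}"

cmd=$(dry_run accelerate launch train_multi_subject_dreambooth.py \
  --pretrained_model_name_or_path="$MODEL_NAME" \
  --instance_data_dir="$INSTANCE_DIRS" \
  --output_dir="$OUTPUT_DIR" \
  --instance_prompt="$INSTANCE_PROMPTS" \
  --with_prior_preservation \
  --prior_loss_weight=1.0 \
  --class_data_dir="$CLASS_DIRS" \
  --class_prompt="$CLASS_PROMPTS" \
  --num_class_images=400 \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --train_text_encoder \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --checkpointing_steps=800 \
  --max_train_steps=800)
echo "$cmd"
```

Training the text encoder alongside the UNet increases memory use, so the run may need more GPU memory than the original command did.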


@lluu, did you try it with the text encoder? Please let me know how the results are. If you have any other way to train multiple subjects using DreamBooth, please share it here.

Thank you