[Dreambooth] HELP! Does multi-subject Dreambooth training make the result quality worse?

Overview

Hi, I fine-tuned Stable Diffusion v1.5 with the DreamBooth method. It worked fine when I trained a single subject (car or dog), but the result quality dropped when I trained both categories at the same time with the script train_multi_subject_dreambooth.py.

Details

Dataset
Car
l52ZopPQw6PDncKglZ2TrpGcmJ7A
l52ZopPQw6SX0ZOgl52TrpGcmJ7A

Dog
p1-512
p2-512
p3-512

1. Single-subject experiment

--Dog
export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export INSTANCE_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/dog3"
export CLASS_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-dog"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-RPQS-dog-v5"
accelerate launch --mixed_precision="fp16" train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of RPQS dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=400 \
  --checkpointing_steps=600 \
  --max_train_steps=600

--Car
export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export INSTANCE_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/car2"
export CLASS_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-car"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-TKIB-car-v1"
accelerate launch --mixed_precision="fp16" train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of TKIB car" \
  --class_prompt="a photo of car" \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=400 \
  --checkpointing_steps=600 \
  --max_train_steps=600

Results - fine
A photo of RPQS dog
image
A photo of red TKIB car
image

2. Multi-subject experiment

export MODEL_NAME="/home/mobile360/data/lucien/model/diffusers/stable-diffusion-v1-5"
export OUTPUT_DIR="/home/mobile360/data/lucien/data/task/diffusion-multi/ckpt/output-multi-v16"
# Subject 1
export INSTANCE_DIR_1="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/dog4"
export INSTANCE_PROMPT_1="a photo of RPQS dog"
export CLASS_DIR_1="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-dog/"
export CLASS_PROMPT_1="a photo of dog"
# Subject 2
export INSTANCE_DIR_2="/home/mobile360/data/lucien/data/task/diffusion-multi/instance/car2"
export INSTANCE_PROMPT_2="A photo of TKIB car"
export CLASS_DIR_2="/home/mobile360/data/lucien/data/task/diffusion-multi/class/photo-of-car"
export CLASS_PROMPT_2="A photo of car"

accelerate launch train_multi_subject_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir="$INSTANCE_DIR_1,$INSTANCE_DIR_2" \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="$INSTANCE_PROMPT_1,$INSTANCE_PROMPT_2" \
  --with_prior_preservation \
  --prior_loss_weight=1.0 \
  --class_data_dir="$CLASS_DIR_1,$CLASS_DIR_2" \
  --class_prompt="$CLASS_PROMPT_1,$CLASS_PROMPT_2" \
  --num_class_images=400 \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=8 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --checkpointing_steps=800 \
  --max_train_steps=800
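
Because every per-subject value in the command above is passed as one comma-separated string, a quick check before launching is that all four lists contain the same number of entries. This is a minimal sketch, assuming the script splits these arguments on commas as the command implies. The `count_fields` helper is hypothetical, and the directory values are shortened stand-ins for the real exported paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the comma-separated fields in a string.
count_fields() {
  local IFS=','
  local -a parts
  read -r -a parts <<< "$1"
  echo "${#parts[@]}"
}

# Stand-in values for illustration; substitute the real exported variables.
instance_dirs="instance/dog4,instance/car2"
instance_prompts="a photo of RPQS dog,A photo of TKIB car"
class_dirs="class/photo-of-dog/,class/photo-of-car"
class_prompts="a photo of dog,A photo of car"

n_dirs=$(count_fields "$instance_dirs")
n_prompts=$(count_fields "$instance_prompts")
n_class_dirs=$(count_fields "$class_dirs")
n_class_prompts=$(count_fields "$class_prompts")

# All four lists must describe the same number of subjects.
if [ "$n_dirs" -eq "$n_prompts" ] && \
   [ "$n_dirs" -eq "$n_class_dirs" ] && \
   [ "$n_dirs" -eq "$n_class_prompts" ]; then
  lists_ok=yes
else
  lists_ok=no
fi

echo "subjects: $n_dirs, lists consistent: $lists_ok"
```

Note that a prompt containing a comma would be counted as two entries by this check, and presumably by any comma-based split as well.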

Results - bad
A photo of RPQS dog
image
A photo of red TKIB car
image
A TKIB car with a RPQS dog
image

The results of the multi-subject experiment are of worse quality and look like fake images.
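
The two setups differ in more than the number of subjects: the single-subject runs use gradient accumulation of 2, a learning rate of 2e-6, and fp16 mixed precision, while the multi-subject run sets no accumulation flag, uses 1e-6, and omits the mixed-precision option. The sketch below gives a rough estimate of how many instance samples each subject sees under each setup. It assumes that `max_train_steps` counts optimizer updates, that the unset accumulation defaults to 1, and, as a simplification, that samples are split evenly between the two subjects (the real split likely depends on how many images each instance folder holds).

```shell
#!/usr/bin/env bash
# Values copied from the commands in this post.
single_steps=600; single_accum=2; single_batch=1   # each train_dreambooth.py run
multi_steps=800;  multi_accum=1;  multi_batch=1    # multi-subject run (accumulation of 1 assumed)
num_subjects=2

# Assumption: samples seen = optimizer steps * accumulation steps * batch size.
single_samples=$(( single_steps * single_accum * single_batch ))
multi_samples=$(( multi_steps * multi_accum * multi_batch ))

# Simplifying assumption: samples are shared evenly across subjects.
multi_per_subject=$(( multi_samples / num_subjects ))

echo "single-subject run: $single_samples instance samples at lr 2e-6"
echo "multi-subject run:  $multi_per_subject instance samples per subject at lr 1e-6"
```

Under these assumptions each subject gets roughly a third of the samples, at half the learning rate, so the comparison is not like for like. This does not show that the training budget is the cause of the quality drop, only that it is a confound worth removing before comparing the two methods.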

Questions

1. In principle, does DreamBooth support multi-subject training?

2. Are there any successful cases of multi-subject training with train_multi_subject_dreambooth.py? What should I pay attention to during multi-subject training?

3. In my experiment, what might be the reason for the bad results? Is it because SR was not fine-tuned?

Hey @lluu, that script isn't officially supported because we haven't done much work on multi-subject DreamBooth, so bad quality is to be expected.

Hi @williamberman. So DreamBooth may not support multi-subject training in principle? Is it because no training image shows the two subjects together?

It's just not a well-explored topic. Unfortunately, I don't have any insight into why 🙂