Guidance Scale for Flux LoRA

I have been training DreamBooth + LoRA for a while, following the training script format from diffusers/examples/dreambooth/README_flux.md (huggingface/diffusers on GitHub):

export MODEL_NAME="black-forest-labs/FLUX.1-dev"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="trained-flux-lora"

accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub

My question is: why set the guidance scale to 1? As far as I know, a guidance scale of 1 is not ideal, and the usually preferred values are around 3.5, 7, etc.

For example, in the source code of huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_flux.py (around L460), the guidance scale defaults to 3.5:

    parser.add_argument(
        "--guidance_scale",
        type=float,
        default=3.5,
        help="the FLUX.1 dev variant is a guidance distilled model",
    )
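For context on how that flag is actually consumed: as far as I can tell, during training it is simply turned into one scalar per sample and fed to the transformer as a conditioning input (FLUX.1-dev is guidance-distilled, so this is an embedded value, not classifier-free guidance). A self-contained sketch of that step; the values and names below are illustrative, not the script's exact code:

    import torch

    # --guidance_scale just becomes a per-sample tensor that conditions the model.
    guidance_scale = 1.0   # what the README passes
    batch_size = 4         # stands in for model_input.shape[0]

    guidance = torch.tensor([guidance_scale]).expand(batch_size)
    print(guidance)        # tensor([1., 1., 1., 1.])
    # this tensor is what ends up in FluxTransformer2DModel.forward(..., guidance=...)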

Since I'm working on Flux, I also checked the source code of FluxTransformer2DModel, which is the model used in the Flux LoRA DreamBooth script. In its forward() function, I see that the guidance scale is multiplied by 1000 (diffusers/src/diffusers/models/transformers/transformer_flux.py in huggingface/diffusers on GitHub):

        if guidance is not None:
            guidance = guidance.to(hidden_states.dtype) * 1000
        else:
            guidance = None
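For what it's worth, a few lines above that, the timestep is scaled by 1000 in the same way, and both values then go through the same sinusoidal projection inside the combined time/guidance embedder, presumably so the embedding sees values in the familiar 0..1000 timestep range rather than 0..1. A rough, self-contained illustration of that projection step (embedding_dim=256 is just an assumption for the sketch):

    import torch
    from diffusers.models.embeddings import get_timestep_embedding

    # guidance is scaled exactly like the timestep before the sinusoidal projection
    guidance = torch.tensor([3.5]) * 1000
    emb = get_timestep_embedding(guidance, embedding_dim=256)
    print(emb.shape)  # torch.Size([1, 256])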

What's the reason here?


I don't understand the theory, but I understand the practical answer: set it to 1.0. If kohya-ss says so, it must be correct.

kohya-ss on Aug 29, 2024
I don’t know about “Distilled CFG at 3.5”, but a model trained with guidance_scale 1.0 should require a guidance scale of around 3.5 for inference, just like the original model.
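In practice, that just means you keep --guidance_scale=1 for training and pass a normal guidance scale (e.g. 3.5) at inference. A minimal sketch, assuming the LoRA was pushed to the Hub (the repo id below is a placeholder):

    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")
    pipe.load_lora_weights("your-username/trained-flux-lora")  # placeholder repo id

    image = pipe(
        "A photo of sks dog in a bucket",
        guidance_scale=3.5,       # distilled guidance at inference, even though training used 1.0
        num_inference_steps=28,
    ).images[0]
    image.save("sks_dog_bucket.png")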


lol thanks for the information!
