Explicit support for masked loss and schedule-free optimizers

Note: this is my first serious involvement with diffusers training scripts, so I may be missing some concepts for now, but I'm trying to learn as I go.

I'm trying to extend a script from the advanced_diffusion_training folder that fine-tunes a DreamBooth LoRA for Flux (diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py at main · huggingface/diffusers · GitHub). Specifically, I want to:

  1. add support for schedule-free optimizers (primarily AdamWScheduleFree)
  2. add a way to use masked loss, based on mask images or alpha-channel information (possibly related to Full support for Flux attention masking · Issue #10194 · huggingface/diffusers · GitHub)

I'm basing both additions on how they're handled in kohya-ss's sd-scripts (sd-scripts/flux_train_network.py at sd3 · kohya-ss/sd-scripts · GitHub is the main source of inspiration), but:

  1. I'm not sure I'm adding the schedule-free optimizer's train/eval switching in all the right places (before actual training, before sampling images, before saving checkpoints, etc.); see the first sketch after this list
  2. the masked loss part has me stumped; I think we can build on the way the dataset is constructed (DreamBoothDataset has no out-of-the-box support for alpha_mask, whereas DreamBoothSubset from sd-scripts does), but maybe I'm missing something; the second and third sketches after this list show where I'm headed
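
On point 1, here is a minimal sketch of where I believe the switches belong, using the schedulefree package's AdamWScheduleFree. The toy model and loop are placeholders standing in for the script's LoRA parameters and training loop:

```python
import torch
from schedulefree import AdamWScheduleFree

model = torch.nn.Linear(16, 16)  # stand-in for the LoRA parameters
optimizer = AdamWScheduleFree(model.parameters(), lr=1e-4)

optimizer.train()  # before any training steps: fast (non-averaged) weights
for step in range(1, 1001):
    loss = model(torch.randn(4, 16)).pow(2).mean()  # placeholder loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    if step % 250 == 0:
        # schedulefree swaps in the averaged weights in eval mode, so both
        # validation sampling and checkpoint saving should happen under eval()
        optimizer.eval()
        # ... sample validation images / save the LoRA state dict here ...
        optimizer.train()  # switch back before the next training step

optimizer.eval()  # and once more before the final save at the end of training
```

My understanding is that any place the script reads the weights for inference or serialization (validation sampling, intermediate checkpoints, the final save) needs the eval()/train() wrapping, since the averaged weights only exist in eval mode. Corrections welcome if I've misread the schedulefree docs.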
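On point 2, the dataset side seems approachable: if the instance images carry an alpha channel, the mask could be pulled out in DreamBoothDataset's __getitem__. Here is a sketch of the idea (load_image_and_mask is a hypothetical helper of mine, not something in the script):

```python
import numpy as np
import torch
from PIL import Image

def load_image_and_mask(path):
    # Hypothetical helper: returns the RGB image plus a (1, H, W) float mask
    # in [0, 1], taken from the alpha channel when one is present.
    image = Image.open(path)
    if image.mode in ("RGBA", "LA"):
        alpha = np.asarray(image.getchannel("A"), dtype=np.float32) / 255.0
        mask = torch.from_numpy(alpha)[None]  # 1.0 = keep, 0.0 = ignore
    else:
        w, h = image.size
        mask = torch.ones(1, h, w)  # no alpha: train on the whole image
    return image.convert("RGB"), mask
```

The mask would then need to go through the same crop/resize transforms as the image so the two stay aligned, which is roughly what sd-scripts' alpha_mask handling does.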
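The part I'm less sure about is applying that mask to the loss, since for Flux the model prediction lives in packed latent space. My current thinking, sketched below: downsample the pixel mask to the VAE's latent resolution, pool it 2x2 to match the packing that _pack_latents does (one token per 2x2 latent patch), and weight the per-element MSE with it. The pooling is a simplification (one value per token, max over the patch), and masked_mse_loss is my own function, not part of the script:

```python
import torch
import torch.nn.functional as F

def masked_mse_loss(model_pred, target, pixel_mask, latent_h, latent_w):
    # model_pred / target: (B, L, C) packed latents, L = (latent_h/2)*(latent_w/2)
    # pixel_mask: (B, 1, H, W) float mask in [0, 1] at pixel resolution
    mask = F.interpolate(pixel_mask, size=(latent_h, latent_w), mode="area")
    mask = F.max_pool2d(mask, kernel_size=2)   # (B, 1, latent_h/2, latent_w/2)
    mask = mask.flatten(2).transpose(1, 2)     # (B, L, 1), broadcasts over C
    loss = (model_pred.float() - target.float()) ** 2
    loss = loss * mask                         # zero out masked-away tokens
    # normalize by the unmasked element count so the loss scale stays stable
    return loss.sum() / (mask.sum().clamp(min=1.0) * model_pred.shape[-1])
```

If there's a more principled way to map a pixel mask onto the packed tokens (or this should happen before packing instead), I'd love to hear it.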

My current attempts live here: A version of https://github.com/huggingface/diffusers/blob/main/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py, but it tries to add schedulefree optimizers & masked loss training · GitHub

Any pointers would be welcome.
