Help with fine-tuning Stable Diffusion v1-5 with PyTorch Lightning

Hi,

The pipeline is only used at inference time. During fine-tuning, you need to load its individual building blocks (such as the U-Net) and optimize them directly. In fact, only the U-Net gets fine-tuned; the VAE and text encoder are frozen, as seen here.

Basically you would need to port this script to PyTorch Lightning: https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py.
