What is the proper way to add LoRA adapters and keep some layers trainable?

Hello everyone!

Could someone explain why, when I add LoRA adapters to a diffusion model using the add_adapter method, the layers that I would like to keep trainable (listed in the modules_to_save config parameter) are actually not trainable?
They are trainable if I use the get_peft_model method, though. Is this a bug, or am I misunderstanding something?

diffusers 0.32.1
peft 0.14.0

from diffusers import AutoencoderKL
from peft import get_peft_model, LoraConfig


def print_trainable_parameters(model):
    trainable_params = 0
    all_param = 0
    count_trainable_conv_in = 0
    for name, param in model.named_parameters():
        all_param += param.numel()
        # report requires_grad for the layer listed in modules_to_save
        if "encoder.conv_in" in name:
            print(name, "requires_grad: ", param.requires_grad)
        if param.requires_grad:
            if "encoder.conv_in" in name:
                count_trainable_conv_in += 1
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || "
        f"trainable%: {100 * trainable_params / all_param:.2f} || "
        f"trainable_conv_in: {count_trainable_conv_in}"
    )


if __name__ == "__main__":
    vae = AutoencoderKL.from_pretrained("stabilityai/sd-turbo", subfolder="vae")
    encoder_param_names = [
        "conv1",
        "conv2",
        "conv_shortcut",
        "conv",
        "conv_out",
        "to_k",
        "to_q",
        "to_v",
        "to_out.0",
    ]

    module_names_to_keep = ["encoder.conv_in"]

    lora_config = LoraConfig(
        r=4,
        init_lora_weights="gaussian",
        target_modules=encoder_param_names,
        modules_to_save=module_names_to_keep,
    )

    adapted_vae = get_peft_model(vae, lora_config)
    print("Option one: using get_peft_model")
    print_trainable_parameters(adapted_vae)

    print("Option two: using add_adapter")
    vae.add_adapter(lora_config)
    print_trainable_parameters(vae)

There is a problem in your script, which is quite easy to miss. When you call adapted_vae = get_peft_model(vae, lora_config), the function will actually modify the vae in-place. Therefore, when you later call vae.add_adapter(lora_config), the vae already has the LoRA adapter added, but diffusers does not expect that. If you re-create a fresh vae by calling vae = AutoencoderKL.from_pretrained(...) before calling vae.add_adapter, you should see the expected result.
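
A minimal sketch of that fix, continuing from your script (lora_config and print_trainable_parameters are the names you defined above):

print("Option two: using add_adapter")
# reload a fresh VAE so the in-place changes made by get_peft_model don't leak into this option
vae = AutoencoderKL.from_pretrained("stabilityai/sd-turbo", subfolder="vae")
vae.add_adapter(lora_config)
print_trainable_parameters(vae)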

It is still a bit unexpected that requires_grad is False for those layers, but since this whole pattern (re-using the same model instance for both calls) should not be used anyway, I don’t think there is an immediate TODO from this.


Oh, thanks! I missed that get_peft_model is an in-place operation. However, that doesn’t fix the problem with requires_grad == False. Could you please share the proper way to add LoRA and keep some layers trainable?


The diffusers code is not really prepared to handle modules_to_save, so it doesn’t set requires_grad for those modules. It would be better to use a PEFT model for this, i.e. to use get_peft_model as in your first example. If you don’t want to use get_peft_model, you can always set the gradients manually, e.g. vae.encoder.conv_in.modules_to_save.requires_grad_(True).
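
For example, something along these lines (a sketch continuing from your script; modules_to_save here is the attribute on the PEFT wrapper that holds the trainable copy of encoder.conv_in):

vae = AutoencoderKL.from_pretrained("stabilityai/sd-turbo", subfolder="vae")
vae.add_adapter(lora_config)

# diffusers' add_adapter leaves the modules_to_save copy frozen, so enable its gradients by hand
vae.encoder.conv_in.modules_to_save.requires_grad_(True)

print_trainable_parameters(vae)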

I’ll check if there is a good way to handle it automatically.


Noted, thank you very much! I wasn’t sure whether manually setting vae.encoder.conv_in.modules_to_save.requires_grad_(True) would break some other logic.
