Does load_attn_procs work with safetensors?

Hi,

I’m trying to load a model and checkpoint with version 0.14.0 of Diffusers (safetensors is also installed) and I’m not able to get it working. The code looks like this:

    pipeline = pipeline_type.from_pretrained(
        "Lykon/DreamShaper",
        revision="main",
        torch_dtype=torch.float16,
    )

    pipeline.unet.load_attn_procs(
        "Lykon/DreamShaper",
        revision="main",
        weight_name="DreamShaper_4BakedVae_fp16.safetensors",
        use_safetensors=True,
    )

It downloads the file but then errors with invalid load key, '\x9f', which suggests that Diffusers is treating it as a pickle file.

I am not having much luck finding working examples of load_attn_procs with anything other than a PyTorch .bin file, so maybe it’s just not there yet.
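
For what it’s worth, here is a minimal, untested sketch of one possible workaround: download the file, load it with the safetensors library, and hand the resulting state dict to load_attn_procs, assuming load_attn_procs also accepts an already-loaded dict instead of a repo id (I have not confirmed that this gets around the key problem; the repo and filename are just the ones from my snippet above):

    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file

    # Download the .safetensors file and load it directly with safetensors
    lora_path = hf_hub_download(
        repo_id="Lykon/DreamShaper",
        filename="DreamShaper_4BakedVae_fp16.safetensors",
        revision="main",
    )
    state_dict = load_file(lora_path)

    # Pass the loaded state dict to load_attn_procs instead of a repo id
    pipeline.unet.load_attn_procs(state_dict)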

If anyone has gotten this to work, I would love to know it!

Don


I’m running into the same issue.

It appears that the KeyError is caused by an assumption in the code that keys ending in ‘.lora_down.weight’ and ‘.lora_up.weight’ will always be present for each block in lora_grouped_dict. However, that does not always seem to be the case.

To fix this issue, try adding a check to ensure that the keys are present before accessing them. Define a new_load_attn_procs helper like this:

    from collections import defaultdict


    def new_load_attn_procs(pipe, tensors):
        lora_grouped_dict = defaultdict(dict)

        # Group the flat checkpoint by attention block, assuming kohya/webui-style
        # key names such as "lora_unet_..._attn1_to_k.lora_down.weight"
        for key, value in tensors.items():
            if key.endswith(".lora_down.weight") or key.endswith(".lora_up.weight"):
                module_name, lora_suffix = key.split(".", 1)
                if "_to_" not in module_name:
                    continue
                block, proj = module_name.rsplit("_to_", 1)
                # Store sub-keys like "to_k.lora_down.weight" so the lookups below match
                lora_grouped_dict[block]["to_" + proj + "." + lora_suffix] = value

        for block, value_dict in lora_grouped_dict.items():
            # Skip blocks that are missing the expected to_k weights
            if "to_k.lora_down.weight" not in value_dict or "to_k.lora_up.weight" not in value_dict:
                continue

            rank = value_dict["to_k.lora_down.weight"].shape[0]
            cross_attention_dim = value_dict["to_k.lora_down.weight"].shape[1]
            hidden_size = value_dict["to_k.lora_up.weight"].shape[0]

            # Assuming `pipe.unet` has a (hypothetical) helper to look up an attention
            # processor by its block key, and that the processor exposes
            # `to_k.lora_down` / `to_k.lora_up` modules
            attn_processor = pipe.unet.get_attention_processor_by_key(block)
            attn_processor.lora_rank = rank
            attn_processor.cross_attention_dim = cross_attention_dim
            attn_processor.hidden_size = hidden_size
            attn_processor.to_k.lora_down.load_state_dict({"weight": value_dict["to_k.lora_down.weight"]})
            attn_processor.to_k.lora_up.load_state_dict({"weight": value_dict["to_k.lora_up.weight"]})


    # Replace the original call to `pipe.unet.load_attn_procs(tensors)` with the new function
    new_load_attn_procs(pipe, tensors)
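
For clarity, tensors above is just the raw state dict loaded from the .safetensors file, for example with the safetensors library (the filename below is the one from the original post):

    from safetensors.torch import load_file

    # Path to the previously downloaded .safetensors file
    tensors = load_file("DreamShaper_4BakedVae_fp16.safetensors")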

Let me know how it goes.


Thanks for reporting this! I just opened a GitHub issue to track this down.


Thanks @slavakurilyak! Appreciate your input.

Trying to understand this better – could you walk me through a scenario in which this key mismatch would occur?