ESM-2 QLoRA (gradient checkpointing not compatible?)

I’ve recently moved from PEFT LoRA fine-tuning of ESM-2 models such as facebook/esm2_t6_8M_UR50D to Quantized Low-Rank Adaptation (QLoRA) fine-tuning. However, it seems that ESM-2 models do not support gradient checkpointing. Does anyone know a workaround? See this issue on GitHub, for example: EsmForSequenceClassification does not support gradient checkpointing · Issue #606 · facebookresearch/esm · GitHub
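One common stopgap is to simply skip gradient checkpointing when the model class does not advertise support for it. Transformers models expose a `supports_gradient_checkpointing` attribute, and PEFT's `prepare_model_for_kbit_training` accepts `use_gradient_checkpointing=False` so you can run QLoRA without checkpointing. Below is a minimal sketch of that guard pattern; `LegacyEsmWrapper` is a stand-in class for illustration, not the real ESM model wrapper:

```python
# Sketch of a guard for models that may not support gradient checkpointing.
# LegacyEsmWrapper is a hypothetical stand-in; real code would load an
# ESM-2 checkpoint via transformers instead.
class LegacyEsmWrapper:
    """Mimics a model class that lacks gradient checkpointing support."""
    supports_gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        raise ValueError("gradient checkpointing not supported")


def safe_enable_checkpointing(model):
    """Enable gradient checkpointing only if the model advertises support."""
    if getattr(model, "supports_gradient_checkpointing", False):
        model.gradient_checkpointing_enable()
        return True
    return False


print(safe_enable_checkpointing(LegacyEsmWrapper()))  # prints False: skipped safely
```

With a real quantized model you would do the same check before calling `prepare_model_for_kbit_training(model, use_gradient_checkpointing=...)`, trading some extra GPU memory for compatibility.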


Dear Amelie,

Have you solved it? I might try to help but I want to check first that it is still an open problem.

It seems to be mostly resolved, although there is still an issue with the bias setting. For example, with a LoRA config like the following, I cannot use "all" or "lora_only"; only "none" works:

    from peft import LoraConfig

    peft_config = LoraConfig(
        bias="none",  # "all" and "lora_only" currently fail; only "none" works
        # modules_to_save=["classifier"],
    )

Otherwise it seems to be working fine now. It wouldn’t hurt to have a second pair of eyes on it just to make sure everything is working properly.