I’ve recently wandered from PEFT LoRA fine-tuning of ESM-2 models such as facebook/esm2_t6_8M_UR50D
into Quantized Low Rank Adaptation (QLoRA) fine-tuning. However, it seems as though ESM-2 models do not allow gradient checkpointing. Does anyone know a workaround? See this issue on Github for example: EsmForSequenceClassification does not support gradient checkpointing · Issue #606 · facebookresearch/esm · GitHub
Dear Amelie,
Have you solved it? I might try to help, but I want to check first whether it is still an open problem.
It seems to be mostly resolved, although there is still some issue with using biases. For example, with a LoRA config like the following, I cannot use `bias="all"`; I can only use `bias="none"`:
from peft import LoraConfig, TaskType

peft_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    inference_mode=False,
    r=config["r"],
    lora_alpha=config["lora_alpha"],
    target_modules=[
        "query",
        "key",
        "value",
        "EsmSelfOutput.dense",
        "EsmIntermediate.dense",
        "EsmOutput.dense",
        "EsmContactPredictionHead.regression",
        "classifier",
    ],
    lora_dropout=config["lora_dropout"],
    bias="none",  # "all" and "lora_only" currently fail for me
    # modules_to_save=["classifier"]
)
Otherwise it seems to be working fine now. It wouldn’t hurt to have a second pair of eyes on it just to make sure everything is working properly.
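For anyone landing here later, this is a minimal sketch of how such a config might be wired up for QLoRA with gradient checkpointing. It assumes the standard `transformers`/`peft`/`bitsandbytes` APIs; the LoRA hyperparameters and reduced `target_modules` list are placeholders, not a tested recipe:

```python
import torch
from transformers import AutoModelForTokenClassification, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# Load ESM-2 quantized to 4-bit (QLoRA-style NF4 quantization).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForTokenClassification.from_pretrained(
    "facebook/esm2_t6_8M_UR50D",
    quantization_config=bnb_config,
)

# prepare_model_for_kbit_training() casts norms and the head to fp32,
# enables gradient checkpointing, and calls enable_input_require_grads(),
# which is the usual workaround when checkpointing alone breaks gradient
# flow to the (frozen) quantized base weights.
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

peft_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=8,                                          # placeholder value
    lora_alpha=16,                                # placeholder value
    target_modules=["query", "key", "value"],     # trimmed for brevity
    lora_dropout=0.1,                             # placeholder value
    bias="none",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
```

Note this needs a CUDA-capable GPU and `bitsandbytes` installed; treat it as a starting point to adapt, not a verified setup.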