Hyperparameter Tuning with LoRA configuration and PEFT

Hello, I have a model and a script based on this DataCamp tutorial that uses Phi-3.5 for classification:
Fine-Tuning Phi-3.5 on E-Commerce Classification Dataset | DataCamp

I want to know if I can do some hyperparameter search to maximize a specific metric.

My question is mainly about whether I can tune the LoRA configuration (adding variables such as r, lora_alpha, or lora_dropout to the search space) in addition to the TrainingArguments and the model configuration variables.

Below is an example of a usual LoRA configuration:

from peft import LoraConfig
from transformers import TrainingArguments

output_dir="Phi-3.5-mini-instruct"

peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=modules,
)

training_arguments = TrainingArguments(
    output_dir=output_dir,                    # directory to save and repository id
    num_train_epochs=1,                       # number of training epochs
    per_device_train_batch_size=1,            # batch size per device during training
    gradient_accumulation_steps=4,            # number of steps before performing a backward/update pass
    gradient_checkpointing=True,              # use gradient checkpointing to save memory
    optim="paged_adamw_8bit",
    logging_steps=1,                         
    learning_rate=2e-5,                       # learning rate, based on QLoRA paper
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,                        # max gradient norm based on QLoRA paper
    max_steps=-1,
    warmup_ratio=0.03,                        # warmup ratio based on QLoRA paper
    group_by_length=False,
    lr_scheduler_type="cosine",               # use cosine learning rate scheduler
    report_to="wandb",                  # report metrics to w&b
    eval_strategy="steps",              # save checkpoint every epoch
    eval_steps = 0.2
)
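
For the TrainingArguments part alone, the standard flow with trainer.hyperparameter_search() seems clear enough; here is a rough sketch of what I mean (Ray Tune backend and an already-built trainer assumed, function name hypothetical):

from ray import tune

# Search over plain TrainingArguments fields only
# (the dict keys map directly to TrainingArguments attributes).
def hp_space_baseline(trial):
    return {
        "learning_rate": tune.loguniform(1e-5, 5e-4),
        "weight_decay": tune.uniform(0.0, 0.1),
        "warmup_ratio": tune.uniform(0.0, 0.1),
    }

best_run = trainer.hyperparameter_search(
    hp_space=hp_space_baseline,
    backend="ray",
    n_trials=10,
    direction="maximize",
)

What I can't see is how to get the LoraConfig values (r, lora_alpha, lora_dropout) into that flow, since the PEFT model is normally built with a fixed config before the Trainer ever sees the trial parameters.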

There doesn’t seem to be a generally accepted theory…

Hi,
So, picking up where I left off, I think I found a way to add the parameters from LoraConfig to the search space.

I had to override the model_init() function:

from copy import deepcopy

from transformers import AutoConfig, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config_base = LoraConfig(
    lora_alpha=16,
    lora_dropout=0,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=modules,
)

model_config_base = AutoConfig.from_pretrained(model_base_name)

def model_init(params):
    model_config = deepcopy(model_config_base)
    lora_config = deepcopy(lora_config_base)
    if params is not None:
        #model config
        model_config.update({'model_config_variable': params['model_config_variable']}) # depending on your base model, the names of the config variables you want to tune may differ
        # model_config.update({...})
        # ...

        # lora config
        # Had to cast the variables due to a weird issue with lora.Layer, which was receiving the values as tuples
        lora_config.r = int(params["r"])
        lora_config.lora_alpha = int(params["r"] // 2) # half of r 
        lora_config.lora_dropout = float(params["lora_dropout"])
        # lora_config.<...> = params[...] --> other variables you want to tune from LoraConfig
        # ...

    model = AutoModelForCausalLM.from_pretrained(model_base_name, ...)
    model = get_peft_model(model, lora_config)
    return model

And in the search space, add the LoRA variables to tune:

from ray import tune

def hp_space(trial):
    # Using Ray Tune as the backend
    return {
        # ... 
        "r": tune.quniform(16, 32, 4),
        "lora_dropout": tune.quniform(0.1, 0.2, 0.01),
        # ...
    }

The model_init function is passed to the Trainer class, and hp_space to the trainer.hyperparameter_search() method.
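
For completeness, here is a minimal sketch of the wiring (train_dataset, eval_dataset, compute_metrics, and the "eval_accuracy" key are assumptions from my own script; use whatever metric your compute_metrics function returns):

from transformers import Trainer

trainer = Trainer(
    model_init=model_init,            # rebuilds the PEFT model from each trial's params
    args=training_arguments,
    train_dataset=train_dataset,      # assumed to be defined in your script
    eval_dataset=eval_dataset,        # assumed to be defined in your script
    compute_metrics=compute_metrics,  # assumed to return e.g. {"accuracy": ...}
)

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    backend="ray",
    n_trials=10,
    direction="maximize",                                        # maximize the chosen metric
    compute_objective=lambda metrics: metrics["eval_accuracy"],
)
print(best_run.hyperparameters)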
