Hey all, I wanted to ask if there is any way to apply LoRA to a model and then train it in a plain training loop, without SFTTrainer.
I am using the following approach, is it fine?
import torch
from torch.optim import Adam
from tqdm import tqdm
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # or any other model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    # load_in_8bit=True,
    # quantization_config=quant_config,
    device_map="auto",
    max_memory={0: "4GiB", 1: "50GiB", "cpu": "0.0GiB"},
    offload_state_dict=True,
    low_cpu_mem_usage=True,
)
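If quantization were enabled, quant_config would typically be a BitsAndBytesConfig, something like this (the values here are illustrative, not part of my actual run):

from transformers import BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)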
...
model.config.use_cache = False  # the KV cache is incompatible with gradient checkpointing
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)  # only needed if the model is actually loaded quantized
peft_params = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_params)
model.print_trainable_parameters()
# after get_peft_model, only the LoRA adapter weights require gradients
optimizer = Adam((p for p in model.parameters() if p.requires_grad), lr=1e-5)
for epoch_idx in range(1):
    for batch_idx, batch in tqdm(enumerate(train_data), total=len(train_data)):
        # with device_map="auto", move inputs to the model's first device
        batch = {k: v.to(model.device) for k, v in batch.items()}
        optimizer.zero_grad()
        outputs = model(**batch)  # batch must provide input_ids, attention_mask, labels
        loss = outputs.loss
        loss.backward()
        optimizer.step()
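For reference, train_data here could be built roughly like this, so that each batch unpacks cleanly into the forward pass (raw_dataset, its "text" column, batch_size, and max_length are placeholders for my actual setup):

from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

def tokenize(example):
    # "text" is a placeholder column name for the actual dataset
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized = raw_dataset.map(tokenize, remove_columns=raw_dataset.column_names)  # raw_dataset: a datasets.Dataset
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # mlm=False makes labels a copy of input_ids
train_data = DataLoader(tokenized, batch_size=2, shuffle=True, collate_fn=collator)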
And then finally calling
merge_and_unload()
to merge the adapter back into the base model.
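Concretely, I mean something like this for the final step (the output path is just a placeholder):

merged_model = model.merge_and_unload()  # folds the LoRA weights into the base weights
merged_model.save_pretrained("llama-3.1-8b-merged")  # placeholder output path
tokenizer.save_pretrained("llama-3.1-8b-merged")

(I understand this straightforward merge assumes the base weights are not quantized; merging into a quantized model needs extra care.)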