Finetuning a Large Language Model

I want to finetune a Phi-3-mini-4k-instruct model for a text2sql task. I have a very small dataset with 100 questions, and I have also taken around 1000 questions from the Hugging Face dataset b-mc2/sql-create-context. Initially I tried to finetune the model for 30 epochs on an L4 GPU to see the result. When I ran inference with test data, I got a bizarre answer from the fine-tuned model; the answer didn't even make sense. Is this behavior of the model because of too little data or too few training epochs, and will it improve with more epochs?
I'm confused about how to finetune a model for a specific task to get the desired answer. Below are the details of the training.
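One common cause of nonsensical output after SFT (independent of data size or epoch count) is a mismatch between the prompt format used for training and the one used at inference. Below is a minimal sketch of formatting a text2sql sample with Phi-3's chat markup (`<|user|>`, `<|assistant|>`, `<|end|>`); the `build_example` helper and the schema/question strings are illustrative, not from the original post.

```python
def build_example(question: str, context: str, sql: str) -> str:
    """Format one text2sql sample; use the identical prompt at inference,
    where the model generates everything after <|assistant|>."""
    prompt = (
        "<|user|>\n"
        f"Given the schema:\n{context}\n"
        f"Write a SQL query to answer: {question}<|end|>\n"
        "<|assistant|>\n"
    )
    # During training, the target SQL is appended after the assistant tag.
    return prompt + sql + "<|end|>"

example = build_example(
    question="How many singers are there?",
    context="CREATE TABLE singer (singer_id INT)",
    sql="SELECT COUNT(*) FROM singer",
)
print(example)
```

If the training text and the inference prompt diverge (e.g. training on raw question/answer pairs but prompting with the chat template), garbled generations are expected regardless of how many epochs you train.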
FineTune Strategy: QLoRA
Library: SFTTrainer
Bitsandbytes Config: (
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    load_in_8bit=False,
)
PeftConfig: (
    r=8,
    lora_alpha=16,
    target_modules="all-linear",
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
Trainer Config: (
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    eval_strategy="steps",
    do_eval=True,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    save_strategy="epoch",
    gradient_checkpointing=True,
    overwrite_output_dir=True,
    num_train_epochs=7,
    bf16=True,
)
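For context, here is a sketch of how the settings above wire together into a QLoRA run with transformers, peft, and trl. The dataset split, output directory, and the `formatting_func` are assumptions (b-mc2/sql-create-context has `question`, `context`, and `answer` columns); the eval settings are omitted here since they also require an eval dataset.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

model_id = "microsoft/Phi-3-mini-4k-instruct"

# 4-bit NF4 quantization, matching the Bitsandbytes config above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on all linear layers, matching the PeftConfig above.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules="all-linear",
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# ~1000 samples, as described in the post (split size is an assumption).
dataset = load_dataset("b-mc2/sql-create-context", split="train[:1000]")

training_args = SFTConfig(
    output_dir="phi3-text2sql",          # assumed name
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=7,
    save_strategy="epoch",
    gradient_checkpointing=True,
    bf16=True,
)

def formatting_func(example):
    # Hypothetical formatter: one string per sample in Phi-3 chat markup.
    # The same format must be used when prompting the finetuned model.
    return (
        f"<|user|>\n{example['context']}\n{example['question']}<|end|>\n"
        f"<|assistant|>\n{example['answer']}<|end|>"
    )

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    formatting_func=formatting_func,
    peft_config=peft_config,
)
trainer.train()
```

Note the inconsistency between the 30 epochs mentioned in the question and `num_train_epochs=7` in the config; with only ~1100 samples, many epochs at lr 2e-4 can also overfit badly, which is another plausible source of degenerate output.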
