Hi! I am trying to fine-tune the meta-llama/Llama-2-7b-chat-hf model for a text-to-SQL task. I am using the b-mc2/sql-create-context dataset so that the model returns the SQL query for a given question and context. After fine-tuning, the model gave worse results: instead of SQL queries, it produced random statements that start with "varchar" and have random numbers appended to the answer.
For fine-tuning, I used the prompt format below.
<s> [INST] <SYS>
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.
You must output the SQL query that answers the question.
</SYS>
### Question:
{question}
### Context:
{context}
### Response:
[/INST] {answer} </s>
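To make the template concrete, here is a minimal sketch of how one dataset row could be rendered into it. It assumes each row has `question`, `context`, and `answer` fields (the column names of b-mc2/sql-create-context); the helper `format_example` and the sample row are hypothetical and only for illustration. The tags are kept exactly as written above; note that the reference Llama 2 chat format spells the system tags `<<SYS>>` and `<</SYS>>`.

```python
SYSTEM_PROMPT = (
    "You are a powerful text-to-SQL model. Your job is to answer questions "
    "about a database. You are given a question and context regarding one or "
    "more tables.\n"
    "You must output the SQL query that answers the question."
)


def format_example(row: dict) -> str:
    """Render one row into the prompt format shown above (hypothetical helper)."""
    return (
        "<s> [INST] <SYS>\n"
        f"{SYSTEM_PROMPT}\n"
        "</SYS>\n"
        f"### Question:\n{row['question']}\n"
        f"### Context:\n{row['context']}\n"
        "### Response:\n"
        f"[/INST] {row['answer']} </s>"
    )


# Made-up row, for illustration only.
sample = {
    "question": "How many heads of departments are older than 56?",
    "context": "CREATE TABLE head (age INTEGER)",
    "answer": "SELECT COUNT(*) FROM head WHERE age > 56",
}

print(format_example(sample))
```

Printing a few rendered rows like this is a quick way to check that the text the model is trained on matches the prompt used at inference time.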
Below is the configuration I used for fine-tuning.
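import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Load the base chat model with 8-bit weights
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# LoRA adapter settings
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
# Assumption: the original post does not show this step; without it the
# LoraConfig above is never applied to the model.
model = get_peft_model(model, config)

# Fall back to EOS for padding when the tokenizer defines no pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# `data` is the tokenized dataset; its preparation is not shown in this post
trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=16,
        gradient_accumulation_steps=8,
        warmup_steps=100,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    # Causal LM objective (mlm=False): labels are a copy of the input ids
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)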
The dataset has about 78.6k examples. I have tried training for 1 epoch and for 3 epochs; with more epochs, the generated SQL queries got worse.
Please suggest improvements that can be made to the code. Additionally, any relevant resources for better understanding fine-tuning for a text-to-SQL task would be greatly appreciated.
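For reference, a rough sketch of what these settings imply in terms of optimizer steps. It assumes a single GPU, that all rows are used for training, and a round figure of 78,600 rows (the post only says 78.6k); `optimizer_steps` is a hypothetical helper, and the exact step count reported by the Trainer may differ slightly.

```python
def optimizer_steps(num_examples, per_device_batch, grad_accum, epochs=1, num_devices=1):
    """Approximate effective batch size and number of optimizer updates."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps = (num_examples // effective_batch) * epochs
    return effective_batch, steps


NUM_EXAMPLES = 78_600  # assumed round figure for "78.6k"

effective_batch, steps_1_epoch = optimizer_steps(NUM_EXAMPLES, 16, 8, epochs=1)
_, steps_3_epochs = optimizer_steps(NUM_EXAMPLES, 16, 8, epochs=3)

# Share of a 1-epoch run spent in the 100 warmup steps
warmup_share = 100 / steps_1_epoch

print("effective batch size:", effective_batch)
print("optimizer steps (1 epoch):", steps_1_epoch)
print("optimizer steps (3 epochs):", steps_3_epochs)
print("warmup share of 1 epoch:", round(warmup_share, 2))
```

This puts the 1-epoch and 3-epoch runs on the same scale when comparing their outputs.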