Working on Fine-Tuning LLMs – Need Some Expert Advice!
I’ve been experimenting with model training using both general user data and tool-specific data from our platform.
Initially, I combined (shuffled) both types of data into a single fine-tuning run. Later, I shifted to a phased approach (roughly the flow sketched after this list):
Phase 1: Fine-tuned the model on general platform data.
Phase 2: Continued training on tool-specific data only, initializing from the Phase 1 adapters.
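For context, here's roughly what my two-phase setup looks like (heavily simplified; the base model name, adapter paths, datasets, and hyperparameters below are placeholders, not my real config):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel, get_peft_model

BASE_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Phase 1: fresh LoRA adapters, trained on general platform data.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
lora_cfg = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(base, lora_cfg)
# ... Trainer(model=model, train_dataset=general_dataset, ...).train()
model.save_pretrained("adapters/phase1")

# Phase 2: reload the Phase 1 adapters and keep training on tool data only.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, "adapters/phase1", is_trainable=True)
# ... Trainer(model=model, train_dataset=tool_dataset, ...).train()
model.save_pretrained("adapters/phase2")
```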
Despite this structured approach, I’m still facing a few recurring issues:
1. Why does the model randomly return the system prompt (i.e., the content of the `system` role) in its replies?
2. Why does it ask for tool details even during general greetings or unrelated platform queries?
3. Why does it miss some required fields when constructing tool calls?
4. Why does it invent tool parameters that aren't defined in the schema?
5. Why doesn't it consistently ask, in plain text, for all of the required fields?
6. After a tool call, why does it repeat its previous answer instead of asking for the required fields again when the same query is asked?
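One thing I suspect for issue 1 (and maybe 6) is that my training loss covers the whole conversation, so the model learns to reproduce system/user text. This is the label-masking fix I'm planning to try; it's a sketch that assumes a tokenizer with a chat template, and that the prompt token IDs are a strict prefix of the full sequence (true for most templates, but worth verifying for yours):

```python
def build_example(tokenizer, messages):
    """Tokenize a chat and mask everything except the assistant reply in the labels."""
    # Token IDs for the prompt (system + user turns), ending where the
    # assistant reply should begin.
    prompt_ids = tokenizer.apply_chat_template(
        messages[:-1], add_generation_prompt=True, tokenize=True
    )
    full_ids = tokenizer.apply_chat_template(messages, tokenize=True)
    # -100 tells the loss function to ignore those positions, so only the
    # assistant tokens contribute to the gradient.
    labels = [-100] * len(prompt_ids) + full_ids[len(prompt_ids):]
    return {"input_ids": full_ids, "labels": labels}
```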
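For issues 3 and 4, I'm also considering validating every generated tool call against the tool's JSON schema at inference time and re-prompting the model with the errors. A rough sketch using the `jsonschema` package (the weather tool schema here is an invented example, not one of our real tools):

```python
import json
from jsonschema import Draft202012Validator

# Invented example schema; "additionalProperties": False is what rejects
# parameters the model makes up.
GET_WEATHER_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "unit": {"type": "string"}},
    "required": ["city"],
    "additionalProperties": False,
}

def tool_call_errors(raw_args: str) -> list[str]:
    """Return human-readable validation errors (empty list means the call is OK)."""
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc}"]
    validator = Draft202012Validator(GET_WEATHER_SCHEMA)
    return [err.message for err in validator.iter_errors(args)]

# Flags both the missing required "city" and the made-up "units" parameter;
# the messages could be fed back to the model for a retry.
print(tool_call_errors('{"units": "C"}'))
```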
If anyone has tackled similar issues while fine-tuning LLMs (e.g., using LoRA adapters or phased training), I’d love to hear your thoughts or tips!
Feel free to comment or DM—any insights are truly appreciated. Thanks in advance!