Model Tuning and Re-Tuning Problems

neo911 · June 2, 2025, 2:40pm

Working on Fine-Tuning LLMs – Need Some Expert Advice!

I’ve been experimenting with model training using both general user data and tool-specific data from our platform.
Initially, I combined (shuffled) both types of data for fine-tuning. Later, I shifted to a phased learning approach:

Phase 1: Fine-tuned the model with general platform data.
Phase 2: Re-tuned it with only tool-related data using adapters from Phase 1.

Despite this structured approach, I’m still facing a few recurring issues:

Why does the model randomly return the system prompt (e.g., the one provided in system role) in its replies?
Why does it ask for tool details even during general greetings or unrelated platform queries?
Why does it miss some required fields when constructing tool calls?
Why does it invent new tool parameters not defined in the schema?
Why doesn’t it ask for all required fields in plain text consistently?
After a tool call, why does it repeat previous answers instead of asking for required fields again if the same query is asked?

If anyone has tackled similar issues while fine-tuning LLMs (e.g., using LoRA adapters or phased training), I’d love to hear your thoughts or tips!

Feel free to comment or DM—any insights are truly appreciated. Thanks in advance!

Mdrnfox · June 2, 2025, 3:14pm

Did your fine-tuning data include the system prompts, so that the model was trained on seeing them?
You could add an instruction to the system prompt telling the model not to repeat it (at inference), or strip it out of the output afterwards. I would probably use the first method to save on token count.
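If you go with stripping instead, a minimal post-processing sketch could look like the following. The function name `strip_system_prompt` is made up for illustration, and it assumes the leak is a verbatim copy of the system prompt; paraphrased leaks would not be caught.

```python
def strip_system_prompt(reply: str, system_prompt: str) -> str:
    """Remove verbatim copies of the system prompt from a model reply."""
    prompt = system_prompt.strip()
    if not prompt:
        # Nothing to strip; return the reply unchanged.
        return reply
    # Drop exact occurrences, then tidy leftover whitespace at the edges.
    return reply.replace(prompt, "").strip()


reply = "You are a helpful platform assistant. Hello! How can I help?"
print(strip_system_prompt(reply, "You are a helpful platform assistant."))
```

This only hides the symptom at inference time. If the training completions themselves contain the system prompt, cleaning the dataset is still needed.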

I think you are seeing a default behavior learned from your data: the model can't distinguish user greetings from user queries about the tools. To offset this, I would add synthetic data containing normal conversational interactions between the model and the user to your Phase 2. You could also add a label to the prompt that flags whether the user is actually asking for a tool, and use it to decide whether to output metadata.
This is an assumption on my part, but your data is probably missing a consistent schema format, and the model has underlearned the schema. You probably have some inputs that carry the needed schema data and others where the schemas are trimmed. You could include both the required and the optional tool schemas. You can also validate the output JSON to confirm correctness.
Again, there is probably a sample imbalance, so the model has not learned what it needs to ask for.
You need to work on the cache/retrieval logic, or reissue the call as a new query.
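To make the first suggestion concrete, here is a rough sketch of replaying some general conversation examples into the Phase 2 set. The function name `mix_phase2_data` and the `general_ratio` parameter are hypothetical, and the default of 0.3 is an arbitrary starting point, not a tested recommendation.

```python
import random


def mix_phase2_data(tool_examples, general_examples, general_ratio=0.3, seed=0):
    """Return a shuffled Phase 2 set with some general examples mixed in.

    general_ratio is the target share of general examples in the result.
    """
    if not 0 <= general_ratio < 1:
        raise ValueError("general_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_general / (n_general + n_tool) = general_ratio for n_general.
    n_general = round(len(tool_examples) * general_ratio / (1 - general_ratio))
    # Cannot sample more general examples than are available.
    n_general = min(n_general, len(general_examples))
    mixed = list(tool_examples) + rng.sample(list(general_examples), n_general)
    rng.shuffle(mixed)
    return mixed
```

The idea is that Phase 2 still sees greetings and unrelated platform questions answered without a tool call, so the tool-asking behavior is less likely to become the default.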

Hope this helps

Pimpcat-AU · June 10, 2025, 8:01pm

Always clean and standardize your dataset

def clean_dataset(data):
    # Remove repeated system prompts and deduplicate examples
    cleaned = []
    seen = set()
    for ex in data:
        key = (ex['prompt'], ex['completion'])
        if key not in seen:
            # Strip the system prompt from the completion, if one is present
            system_prompt = ex.get('system_prompt', '')
            if system_prompt:
                ex['completion'] = ex['completion'].replace(system_prompt, '')
            cleaned.append(ex)
            seen.add(key)
    return cleaned

Validate all tool calls for missing or invented fields

def validate_tool_calls(tool_calls, schema):
    valid_calls = []
    for call in tool_calls:
        # Remove parameters not in schema
        call['parameters'] = {k: v for k, v in call['parameters'].items() if k in schema}
        # Add missing fields as None (substitute a default here if you have one)
        for field in schema:
            if field not in call['parameters']:
                call['parameters'][field] = None
        valid_calls.append(call)
    return valid_calls
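Note that `validate_tool_calls` repairs calls silently. For training data it may be safer to flag problem calls instead, so that examples with missing required fields or invented parameters can be reviewed or dropped rather than trained on with `None` values. A possible sketch, where `check_tool_call` and the split into `required` and `optional` sets are my own assumptions about how the schema is described:

```python
def check_tool_call(parameters, required, optional=frozenset()):
    """Return (missing, unknown) parameter names for one tool call."""
    allowed = set(required) | set(optional)
    # Required names that the call does not supply at all.
    missing = sorted(set(required) - set(parameters))
    # Names the call supplies that the schema does not define.
    unknown = sorted(set(parameters) - allowed)
    return missing, unknown


missing, unknown = check_tool_call(
    {"field1": "a", "made_up": 1},
    required={"field1", "field2"},
    optional={"field3"},
)
print(missing, unknown)  # ['field2'] ['made_up']
```

The same check can be run on model output at inference time before a tool is actually executed.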

Ensure plain text prompts always require all fields

def enforce_required_fields(prompt, required_fields):
    for field in required_fields:
        # Substring check: is the field name mentioned anywhere in the prompt?
        if field not in prompt:
            prompt += f"\nPlease provide the following required field: {field}"
    return prompt
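The substring check in `enforce_required_fields` only looks at whether a field name appears in the prompt text. An alternative sketch, assuming you track the values collected so far in a dict, builds one consistent clarification message that can be used as the assistant turn in training examples. The name `missing_fields_message` and the message wording are illustrative only.

```python
def missing_fields_message(provided, required_fields):
    """Build a plain-text request for required fields not yet provided.

    Returns None when every required field already has a value.
    """
    # Treat absent keys, None, and empty strings as "not provided".
    missing = [f for f in required_fields if provided.get(f) in (None, "")]
    if not missing:
        return None
    return "Please provide the following required fields: " + ", ".join(missing)


print(missing_fields_message({"field1": "x"}, ["field1", "field2", "field3"]))
```

Using one fixed phrasing for these turns across the dataset may help the model learn to ask for all missing fields at once instead of only some of them.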

Example usage:

data = load_your_data()  # placeholder: replace with your own loader
schema = {'field1', 'field2', 'field3'}
data = clean_dataset(data)
for ex in data:
    ex['tool_calls'] = validate_tool_calls(ex['tool_calls'], schema)
    ex['prompt'] = enforce_required_fields(ex['prompt'], schema)

Optional: add randomness to reduce repeated answers

import random

def shuffle_answers(answers):
    random.shuffle(answers)  # shuffles the list in place
    return answers

If the model is repeating previous answers, you may need to add noise or sample from a more diverse training set.
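Before adding noise, it may be worth checking whether a few completions dominate the dataset. A small sketch, assuming each example is a dict with a `completion` key as in the code above; `most_repeated_completions` is a hypothetical helper name.

```python
from collections import Counter


def most_repeated_completions(data, top_n=5):
    """Return the top_n most frequent completions with their counts."""
    # Strip whitespace so trivially different copies are counted together.
    counts = Counter(ex["completion"].strip() for ex in data)
    return counts.most_common(top_n)


sample = [{"completion": "Done."}, {"completion": "Done. "}, {"completion": "Which tool?"}]
print(most_repeated_completions(sample, top_n=1))  # [('Done.', 2)]
```

If one answer shows up far more often than the rest, downsampling it is a more targeted fix than shuffling.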

Solution provided by Triskel Data Deterministic AI.
