First, I want to mention that I am visually impaired, which makes accessing information very inconvenient.
I do not know much about coding.
I wanted to have my own language model, so I created the code and dataset using GPT-3.
My purpose is to write scenarios, and I wanted a language model that can run in koboldcpp.
I tried xwin-mlewd-7b-v0.2.Q4_K_M.gguf after people advised me to use it.
I ran into many limitations with this model, perhaps because I do not speak English well.
So I decided to create my own language model.
I used llama-3.2-Korean-Bllossom-3B as the base model.
The reason is that few 3B models support Korean, and its license allows free use.
I completed fine-tuning and model merging.
However, I cannot adjust the prompt, because the code GPT generated did not work properly. I have cloned and browsed many repositories on GitHub, but I cannot find what I want.
What I want is not a chatbot; the question-and-answer format has nothing to do with the model I want.
But the repositories on GitHub only support the question-and-answer format.
How can I solve this problem?
Even for writing scenarios, I think a chatbot model is sufficient. Of course, some models are suitable and others are not. You will need to instruct it via system prompts to generate longer texts.
However, for scenario writing, a 3B model might be too small. You may also want to consider an LLM with long-context support.
Furthermore, for languages other than English, models like Gemma 2 or Qwen 2.5 are likely to perform better than Llama. For Korean, SOLAR or EEVE are probably the best-known options.
It sounds like you’re trying to fine-tune Llama-3.2-Korean-Bllossom-3B for scenario writing, rather than a chatbot-style Q&A format. The challenge here is adjusting the prompt structure to align with your intended use case.
How to Adjust the Prompt for Scenario Writing
Since most repositories focus on Q&A-style prompts, you’ll need to restructure your input format to encourage storytelling rather than direct responses. Here’s how:
1. Use Narrative-Based Prompts
Instead of a standard Q&A format, try structuring prompts like:
Write a detailed scene where a detective uncovers a hidden clue in an abandoned house.
or
Describe a futuristic city where AI governs daily life, focusing on the interactions between humans and machines.
This encourages the model to generate full scenarios rather than short answers.
2. Modify System Instructions (If Using KoboldCPP)
If you’re running the model in KoboldCPP, you can adjust the system prompt to guide its behavior:
You are a creative writing assistant specializing in scenario development.
Your task is to generate immersive scenes, dialogues, and world-building elements.
Avoid Q&A formats and focus on storytelling.
This helps steer the model away from chatbot-style responses.
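If it helps, here is a minimal sketch of sending that system prompt together with a narrative prompt through KoboldCPP's HTTP API. It assumes KoboldCPP is running locally with its default KoboldAI-compatible endpoint on port 5001; the field names follow the KoboldAI /api/v1/generate convention, so verify them against your KoboldCPP version:

```python
import json
import urllib.request

# Assumes KoboldCPP's default KoboldAI-compatible API on localhost:5001.
KOBOLD_URL = "http://localhost:5001/api/v1/generate"

system_prompt = (
    "You are a creative writing assistant specializing in scenario development. "
    "Your task is to generate immersive scenes, dialogues, and world-building "
    "elements. Avoid Q&A formats and focus on storytelling."
)
scene_prompt = (
    "Write a detailed scene where a detective uncovers a hidden clue "
    "in an abandoned house."
)

payload = {
    # Prepend the system instructions so the model is steered toward
    # narration instead of chatbot-style replies.
    "prompt": f"{system_prompt}\n\n{scene_prompt}\n\n",
    "max_context_length": 4096,
    "max_length": 512,  # raise this for longer scenes (see point 5 below)
    "temperature": 0.8,
}

request = urllib.request.Request(
    KOBOLD_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    result = json.loads(response.read())

# The KoboldAI API returns the generated text under results[0].text.
print(result["results"][0]["text"])
```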
3. Fine-Tune with Scenario-Based Data
Since your fine-tuning process used GPT-generated datasets, ensure that your training data includes long-form narrative examples rather than Q&A pairs (a short formatting sketch follows this list). You can:
- Collect story excerpts or screenplay dialogues.
- Format training data as structured scene descriptions.
- Use few-shot learning by providing multiple examples in a single prompt.
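As a rough illustration (the file name, field names, and sample text below are hypothetical; match whatever format your training script actually expects), you could convert story excerpts into a JSONL fine-tuning file and reuse the same examples for few-shot prompting:

```python
import json

# Hypothetical long-form narrative samples; replace with your own excerpts.
scenes = [
    {
        "instruction": "Write a detailed scene where a detective uncovers "
                       "a hidden clue in an abandoned house.",
        "scene": "The floorboards groaned as the detective crossed the hall...",
    },
    # ... more full scenes, not short Q&A pairs
]

# One training example per line: the prompt plus the complete scene, so the
# model learns to continue into long-form narration.
with open("scenario_dataset.jsonl", "w", encoding="utf-8") as f:
    for s in scenes:
        record = {"text": f"{s['instruction']}\n\n{s['scene']}"}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Few-shot variant: stack several complete examples into one prompt at
# inference time so the model imitates the scene-writing pattern.
few_shot_prompt = "\n\n---\n\n".join(
    f"{s['instruction']}\n\n{s['scene']}" for s in scenes
)
```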
4. Explore Prompt Optimization Tools
Meta AI recently released Llama Prompt Ops, a toolkit designed to optimize prompts for Llama models. This could help refine your input structure to better suit scenario writing.
5. Adjust Token Length for Longer Outputs
If your model is generating short responses, increase the max tokens parameter in KoboldCPP to allow for longer, more detailed outputs.
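If you are using the API sketch from point 2, the same payload controls this. The field names below are from the KoboldAI API convention, so check them against your KoboldCPP version:

```python
payload["max_length"] = 1024           # allow longer generations per request
payload["max_context_length"] = 8192   # only if your model and launch settings support it
```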
Would you like help formatting a dataset for fine-tuning?