I’m currently working on a student project where I’m trying to fine-tune a Flan-T5 Small model (using LoRA) to generate structured game maps based on creative prompts. I followed several tutorials and wanted to ask for advice because I can’t get the model to learn anything meaningful.
The idea is to transform an imaginative text prompt into a structured output (like a spatial map layout).
Example prompt:
A calm sandy beach with palm trees and old fishing boats

Target output:
THEME: [beach, tropical, boats]
SIZE: (27x12)
PATTERN: striped_vertical
ZONE: Sea [position=left]
  CELL TYPE: water
  FEATURES: fishing_boat
  ENEMIES: None
ZONE: Beach [position=right]
  CELL TYPE: sand
  FEATURES: palm_tree, crate, campfire
  ENEMIES: None
RULES:
  features_density=moderate
  loot_density=normal
  enemy_density=none
Typical instruction:
You are an AI level designer for a fantasy adventure game.
Your task is to transform creative environment prompts into structured spatial maps.
Each map must include:
- A rich list of THEMES capturing the essence of the place
- A SIZE in the format (height x width)
- A PATTERN describing the layout: maze, radial, organic, etc.
- One or more ZONE blocks, each with:
  - A unique ID
  - A POSITION (e.g., top-left, center, bottom-right)
  - CELL TYPES (terrain types like lava, rock, grass)
  - FEATURES (visuals or interactive elements)
  - ENEMIES (monsters or traps)
- A RULES section describing element density and level characteristics
Make the output imaginative but always follow the format.
PROMPT:
An overgrown jungle ruin where ancient machines sleep under moss and roots, and wildlife has reclaimed the place.
OUTPUT:
Setup
Model: Flan-T5 Small
LoRA Config:
```python
LoraConfig(
    task_type="SEQ_2_SEQ_LM",
    r=4,
    lora_alpha=32,
    lora_dropout=0.01,
    target_modules=["q"]
)
```
Trainable params: 86,016 / Total params: ~77M (~0.11%)
```python
TrainingArguments(
    output_dir="./results",
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,
    save_strategy="no"
)
```
Dataset size: 36 examples (very small; I was at least hoping to see some signs of overfitting)
No special tokens used (except padding)
Problems
The model often:
- Repeats instruction phrases verbatim (e.g., “A PATTERN describing the layout:…”)
- Produces incomplete or empty outputs
- Copies the input prompt, or ignores the required structure entirely
Hypotheses:
- Is cross-entropy poorly suited to creative generation tasks like this?
- Is the model too small, or otherwise unsuited to learning structured generation?
- Are 36 examples simply too few, even with LoRA?
- Would switching to a stricter JSON-based format guide the model better? (A sketch of what I mean is below.)
I’d be really grateful for any advice, even rough suggestions or examples of similar cases. Thanks a lot in advance!