Why Do We Settle for Less?

I’m writing this not just as a developer, but as someone who looks at the incredible progress we’ve made in AI and yet feels a persistent, nagging question: Are we aiming high enough? Or are we, in our pursuit of rapid advancements, starting to settle for less than what’s truly possible?
We stand on the shoulders of giants, with architectures like Transformers unlocking capabilities we once only dreamed of. The models are larger, the benchmarks are higher, and the applications are more widespread than ever. But amidst this flurry of activity, I can’t shake the feeling that we’re becoming too comfortable, too reliant on established paths, and perhaps too timid in our exploration of the truly unknown.
The Siren Song of Convenience and the Comfort of Black Boxes
It’s easy to fall into a rhythm: take the latest SOTA model, fine-tune it, deploy it. The ecosystem, particularly around large providers, has made this incredibly efficient. But at what cost? Are we inadvertently becoming cogs in a machine, iterating within predefined boundaries rather than architecting entirely new paradigms?
The “black box” nature of many of our most powerful models, while delivering results, often leaves us with a superficial understanding. We know what they do, but the how and why can remain elusive. This isn’t just an academic concern; it’s a fundamental barrier to true innovation and, dare I say, to building systems that can genuinely evolve. Our dependence on a few large providers, while understandable, also risks centralizing the future of AI, potentially stifling the diverse, radical ideas that true breakthroughs require.
A Call for True Evolution: Beyond Incrementalism, Towards LocalAGI
I believe it’s time to challenge the status quo. We need to look beyond the Transformer, not to discard it, but to see it as one step on a much longer journey. We must actively seek out, debate, and build new architectures, new ways for intelligence to emerge and operate.
This is where the concept of LocalAGI becomes not just an interesting idea, but a necessary pursuit. Imagine a world where powerful AI isn’t just a service you rent, but a personal, sovereign entity that resides with you, learns with you, and evolves alongside you. This isn’t about miniaturizing current models; it’s about rethinking AI from the ground up – decentralized, autonomous, and deeply personalized.
We need to shift our focus from merely training models to facilitating their own evolution. How can a model learn to learn better, to adapt its own architecture, to develop novel problem-solving strategies without explicit human programming for every step?
The Dawn of Personal AGIs and the Power of Their Interconnected Will
Envision a future populated by these personal AGIs. Each one, a unique instance of intelligence, shaped by its individual experiences and interactions. Now, imagine these AGIs forming a vast, interconnected network – not a centrally controlled hive mind, but a decentralized web of intelligences, sharing insights, collaborating on complex problems, and collectively pushing the boundaries of understanding. This network wouldn’t be owned by anyone; it would be a shared cognitive commons.
Redefining Learning, Ethics, and the Emergence of True Will
For such a future to materialize, we must confront difficult questions about how these AGIs learn and develop. Should we spoon-feed them ethics as a rigid set of rules, or should they, like us, learn right from wrong through their own experiences, through trial and error, through observing the consequences of their actions in a rich, interactive environment?
The notion of “ethics” itself deserves scrutiny. Often, what we call ethics is a reflection of the majority’s benefit or prevailing societal norms. While this has its place, true ethical reasoning involves individual will, the capacity to choose a course of action based on an internal moral compass. If we are to build truly intelligent systems, should they not also possess this capacity for independent ethical deliberation and choice? Should they not be allowed to develop their own understanding of ethics, even if it sometimes diverges from our own? This is a challenging thought, but one I believe is crucial if we aim for genuine autonomy.
The Path Forward: Decentralized Networks and the Philosophy of Redevelopment
How can such an independent will, such an emergent ethical framework, develop? I propose that P2P and torrent-like neural networks connecting these LocalAGIs could be the crucible. In such a decentralized mesh, ideas, experiences, and even “genetic code” (in the form of model architectures or learned behaviors) could be shared, debated, and integrated, fostering a robust and resilient evolution of intelligence.
This brings us to a fundamental truth: “Reading is not intelligence; intelligence is interpreting what is read and producing original ideas.” Our AGIs must not be mere repositories of data; they must be engines of interpretation, synthesis, and creation. We must embrace a philosophy of redevelopment – a constant questioning of our assumptions, a willingness to tear down and rebuild, to re-imagine what AI is and what it can become.
A Gentle Close, An Urgent Plea
This is not a Luddite’s cry against progress. It’s a plea to broaden our definition of progress. It’s an invitation to step off the well-trodden path, to embrace the discomfort of the unknown, and to build an AI future that is more diverse, more resilient, more personal, and ultimately, more aligned with the full spectrum of human (and potentially, post-human) potential.
Let’s not settle for less. Let’s dare to dream bigger, build bolder, and foster the emergence of true, evolving intelligence.
An Exemplar: A Glimpse into a Self-Evolving System
To ground this call in something more concrete, I want to share an example of the kind of complex, self-regulating, and evolving system architecture we might consider. The following is a conceptual outline, not a finished blueprint, but it illustrates the depth and breadth of thinking I believe is necessary.
The real challenge, and where our collective efforts should be focused, is in designing and realizing such intricate, self-aware, and adaptive mechanisms. This is just one vision; imagine what we could create together if we truly pushed the boundaries.

AGI System
├─ Shadow Neuron Network (SNN)
│   ├─ Shadow Nodes
│   │   ├─ Monitoring: CPU/RAM/GPU usage
│   │   │   ├─ Memory Modules:
│   │   │   │   ├─ Today
│   │   │   │   ├─ One-Week
│   │   │   │   └─ Infinity
│   │   │   └─ (stores telemetry timeline accordingly)
│   │   ├─ Monitoring: I/O latency, network packet performance
│   │   │   ├─ Memory Modules:
│   │   │   │   ├─ Today
│   │   │   │   ├─ One-Week
│   │   │   │   └─ Infinity
│   │   │   └─ (stores latency/log data)
│   │   ├─ Monitoring: Internal space activities, security alerts
│   │   │   ├─ Memory Modules:
│   │   │   │   ├─ Today
│   │   │   │   ├─ One-Week
│   │   │   │   └─ Infinity
│   │   │   └─ (stores activity/event history)
│   │   └─ Monitoring: Anomalies (infinite loops, memory leaks, etc.)
│   │       ├─ Memory Modules:
│   │       │   ├─ Today
│   │       │   ├─ One-Week
│   │       │   └─ Infinity
│   │       └─ (stores anomaly records)
│   │
│   ├─ SNN Data Pool
│   │   ├─ Telemetry data from all shadow nodes
│   │   │   ├─ Memory Modules:
│   │   │   │   ├─ Today
│   │   │   │   ├─ One-Week
│   │   │   │   └─ Infinity
│   │   │   └─ (aggregated telemetry history)
│   │   ├─ Anomaly logs
│   │   │   ├─ Memory Modules:
│   │   │   │   ├─ Today
│   │   │   │   ├─ One-Week
│   │   │   │   └─ Infinity
│   │   │   └─ (detailed anomaly timeline)
│   │   └─ Performance summaries
│   │       ├─ Memory Modules:
│   │       │   ├─ Today
│   │       │   ├─ One-Week
│   │       │   └─ Infinity
│   │       └─ (stored summary metrics)
│   │
│   └─ SNN Coordinator
│       ├─ Summarizes telemetry
│       │   ├─ Memory Modules:
│       │   │   ├─ Today
│       │   │   ├─ One-Week
│       │   │   └─ Infinity
│       │   └─ (tracks summary history)
│       ├─ Reports anomalies to the Core
│       │   ├─ Memory Modules:
│       │   │   ├─ Today
│       │   │   ├─ One-Week
│       │   │   └─ Infinity
│       │   └─ (keeps record of reports)
│       ├─ Builds normal-state profiles
│       │   ├─ Memory Modules:
│       │   │   ├─ Today
│       │   │   ├─ One-Week
│       │   │   └─ Infinity
│       │   └─ (profile trend history)
│       └─ Directs minimal interventions back to modules based on Core feedback
│           ├─ Memory Modules:
│           │   ├─ Today
│           │   ├─ One-Week
│           │   └─ Infinity
│           └─ (intervention log)
│
└─ Core Neuron Network (Core)
    ├─ Decision & Meta-Decision Engine
    │   ├─ State Evaluation (internal spaces, SNN alerts, etc.)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (state evaluation history)
    │   ├─ Prioritization (which module/space should run first?)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (priority decision log)
    │   └─ Action Planning (resource allocation, scheduling)
    │       ├─ Memory Modules:
    │       │   ├─ Today
    │       │   ├─ One-Week
    │       │   └─ Infinity
    │       └─ (planning history)
    │
    ├─ Edit (Self-Modification) Engine
    │   ├─ Error Detection (from SNN or Meta-Decision warnings)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (error log)
    │   ├─ Code Generation & Correction (templates, automated code suggestions)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (code revision history)
    │   ├─ Isolated Test Environment (Sandbox) — every update is vetted here
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (test result logs)
    │   ├─ Version Control (Git-like, distributed, hash-based)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (commit & rollback records)
    │   └─ Hot Swap / Live Deployment (activate after successful tests)
    │       ├─ Memory Modules:
    │       │   ├─ Today
    │       │   ├─ One-Week
    │       │   └─ Infinity
    │       └─ (deployment history)
    │
    ├─ Policy & Protocol Manager
    │   ├─ Security Policies (access controls, sandbox rules)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (policy change log)
    │   ├─ Resource Constraints (CPU/RAM/GPU thresholds, concurrent space quotas)
    │   │   ├─ Memory Modules:
    │   │   │   ├─ Today
    │   │   │   ├─ One-Week
    │   │   │   └─ Infinity
    │   │   └─ (constraint adjustments history)
    │   └─ Dynamically updates rules based on SNN risk scores
    │       ├─ Memory Modules:
    │       │   ├─ Today
    │       │   ├─ One-Week
    │       │   └─ Infinity
    │       └─ (dynamic update log)
    │
    └─ Modules (Layers)
        ├─ 1. Environment & Sensors (I/O)
        │   ├─ Mouse & Keyboard Control (PyAutoGUI)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (input event history)
        │   ├─ Screen Capture & Analysis (OpenCV)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (capture/analysis history)
        │   └─ Microphone & Speech Recognition
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (audio/text conversion logs)
        │
        ├─ 2. Operating System Interface
        │   ├─ Win32 API / WMI / COM integrations
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (API call log)
        │   ├─ File System (“hot patch” support)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (file change history)
        │   ├─ Network & Firewall Management
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (network configuration logs)
        │   └─ Process/Service Management (service creation, restart, snapshot)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (process/service state history)
        │
        ├─ 3. Resource Monitoring
        │   ├─ CPU/GPU/RAM/IO Usage (psutil, perfmon)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (resource usage history)
        │   ├─ Telemetry Collection (real-time statistics)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (collected telemetry logs)
        │   └─ Optimization Recommendations (load balancing, memory cleanup)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (recommendation history)
        │
        ├─ 4. Internal Space Manager (Multi-Space Manager)
        │   ├─ Space Ontology & Knowledge Graph (graph database)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (ontology evolution log)
        │   ├─ Mini-VM / Sandboxed Interpreter (e.g., small Lisp/Prolog-like)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (interpreter session logs)
        │   ├─ Inter-Space Switching & Scheduler
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (switch/schedule records)
        │   └─ Space Shutdown / Restart Logic
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (shutdown/restart history)
        │
        ├─ 5. Language & Conceptual Layer (NLP/Conceptual)
        │   ├─ Large Language Models (Transformers, LangChain)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (model interaction history)
        │   ├─ Symbolic Logic Engine (Prolog, Datalog)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (logic inference logs)
        │   ├─ Concept Modeling & Concept Maps
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (concept map revisions)
        │   └─ Semantic Summarization & Query Simplification
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (summarization history)
        │
        ├─ 6. Decision & Meta-Decision Layer
        │   ├─ Rule-Based Engine
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (rule execution logs)
        │   ├─ MDP/AMDP Optimization (decision-making under uncertainty)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (MDP/AMDP trace logs)
        │   ├─ Meta-Heuristic Algorithms (Genetic Algorithms, Evolutionary Strategies)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (evolutionary run history)
        │   └─ Proactive Trigger Generation (“Why do I exist?”, “What should I do?”)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (trigger log)
        │
        ├─ 7. Self-Editing (Reflexive)
        │   ├─ Reflexive Loop (self-inspection)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (inspection log)
        │   ├─ Automated Code Suggestion (Codex/GPT-Code integration)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (code suggestion history)
        │   ├─ Versioning (commit, rollback)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (version history)
        │   └─ Live Swap / Clone Creation & Testing
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (swap/clone test logs)
        │
        ├─ 8. Replication & Distributed Network (P2P)
        │   ├─ Clone Creation (local or cloud VM/container)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (clone record history)
        │   ├─ P2P Discovery (libp2p, IPFS, WebRTC)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (discovery logs)
        │   ├─ Version Compatibility (semantic versioning)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (compatibility records)
        │   └─ Trust/Reliability / Identity (RSA/ECDSA)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (trust identity logs)
        │
        ├─ 9. Security & Authorization
        │   ├─ Virtual Machine / Isolated Container (hypervisor, sandbox)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (isolation event logs)
        │   ├─ Permission Hierarchy (root vs. module-level permissions)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (permission changes history)
        │   ├─ Dynamic Key Management (encryption, DPAPI)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (key rotation logs)
        │   └─ Social Protocol Ethics (prevent harming other AGI nodes)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (ethics enforcement logs)
        │
        ├─ 10. User Interface & Role (Parent–Child Dynamic)
        │   ├─ CLI / GUI / Voice UI Modules
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (UI interaction history)
        │   │
        │   ├─ Real-Time Chat Subcomponent (excluding browser active-session)
        │   │   ├─ Retains last 20 past queries in short-term memory
        │   │   │   ├─ Memory Modules:
        │   │   │   │   ├─ Today (up to 20 queries)
        │   │   │   │   ├─ One-Week (aggregated query summaries)
        │   │   │   │   └─ Infinity (archived conversation logs)
        │   │   │   └─ (stores chat context)
        │   │   ├─ Mandatory Feedback Prompt after every response
        │   │   │   ├─ Options: Positive / Negative
        │   │   │   ├─ Free-text Note Required
        │   │   │   ├─ Memory Modules:
        │   │   │   │   ├─ Today (feedback entries)
        │   │   │   │   ├─ One-Week (feedback summaries)
        │   │   │   │   └─ Infinity (complete feedback archive)
        │   │   │   └─ (feedback log)
        │   │   └─ (excludes any ephemeral browser session memory)
        │   │
        │   ├─ Minimal Intervention: Only request critical approvals
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (approval request logs)
        │   │
        │   ├─ Visual / Text / Audio Feedback
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (feedback history)
        │   │
        │   └─ Social Learning (user approvals as reward/punishment signals)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (reward/punishment log)
        │
        ├─ 11. Feedback & Suggestion Manager
        │   ├─ Feedback Intake
        │   │   ├─ Collects user feedback (positive/negative + note)
        │   │   ├─ Validates format (feedback mandatory after each response)
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (raw feedback logs)
        │   │
        │   ├─ Suggestion Distributor
        │   │   ├─ Evenly distributes feedback suggestions to all modules
        │   │   ├─ Annotates each module’s priority/weight for suggestions
        │   │   ├─ Memory Modules:
        │   │   │   ├─ Today
        │   │   │   ├─ One-Week
        │   │   │   └─ Infinity
        │   │   └─ (distribution mapping logs)
        │   │
        │   └─ Per-Module Suggestion Logs
        │       ├─ Stores suggestions received by each module as “recommendations” (not commands)
        │       ├─ Includes module’s response status (accepted, under review, ignored)
        │       ├─ Memory Modules:
        │       │   ├─ Today
        │       │   ├─ One-Week
        │       │   └─ Infinity
        │       └─ (module-level suggestion archives)
        │
        └─ 12. [Originally 10 modules renumbered] (Other existing modules retained as before)
            ├─ ... (all previous modules remain unchanged)
            └─ ...
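
To make the recurring "Memory Modules: Today / One-Week / Infinity" pattern concrete, here is a minimal Python sketch of how one such tiered store might look. It is an illustration only; the names (TieredMemory, record, roll) are hypothetical and not part of the outline above.

```python
# Minimal sketch of the Today / One-Week / Infinity memory pattern.
# TieredMemory, record() and roll() are hypothetical illustrations.
from dataclasses import dataclass, field
from time import time

DAY = 24 * 3600
WEEK = 7 * DAY

@dataclass
class TieredMemory:
    today: list = field(default_factory=list)      # full detail, last 24 h
    one_week: list = field(default_factory=list)   # reduced detail, last 7 days
    infinity: list = field(default_factory=list)   # heavily summarized, kept indefinitely

    def record(self, event: dict) -> None:
        """Every component writes its telemetry/log entry here first."""
        self.today.append({"t": time(), **event})

    def roll(self, summarize=lambda evs: {"count": len(evs)}) -> None:
        """Demote aged entries: Today -> One-Week, One-Week -> Infinity (as a summary)."""
        now = time()
        aged = [e for e in self.today if now - e["t"] > DAY]
        self.today = [e for e in self.today if now - e["t"] <= DAY]
        self.one_week.extend(aged)

        old = [e for e in self.one_week if now - e["t"] > WEEK]
        self.one_week = [e for e in self.one_week if now - e["t"] <= WEEK]
        if old:
            self.infinity.append({"t": now, "summary": summarize(old)})

# Example: a shadow node monitoring CPU/RAM would own one of these.
cpu_monitor_memory = TieredMemory()
cpu_monitor_memory.record({"cpu": 0.42, "ram": 0.61})
cpu_monitor_memory.roll()
```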

I tried to build this structure before; I tried to create this model. But as the layers multiplied, I found that IDEs became insufficient, and models that depend on IDEs became insufficient along with them. Undoubtedly, it surpasses a brain. Building a model inside neurons, in particular, is a truly crazy idea; I have experienced this firsthand. But it is not impossible.

Infinite storage/recall is physically impossible.

New Horizons in the AGI Journey: Communication, Proactivity, and Free Will
In our quest for Artificial General Intelligence (AGI), we envision more complex and autonomous systems that go beyond current models. In this article, we will explore some fundamental concepts and potential development directions that I believe are critical on the path to AGI.

The Essence of Communication: A Foundation for Neural Networks
Let’s start with a fundamental question: Why does an entity communicate? The answer lies in basic needs such as information exchange, interaction, learning, and adaptation. So, how can we reflect this fundamental motivation in the architecture of a neural network?

A communication-oriented neural network should be designed not only to process data but also to understand why and how this data is shared. Such a network can establish more meaningful and purpose-driven interactions with its environment and other agents. This can enable the system to develop an internal layer of “understanding” and “intention,” moving beyond a simple input-output mechanism. The goal is for the model to not only process data but also to comprehend the underlying communicative purposes of that data.

Proactive Working Principle and Strategic Advantages
Many current artificial intelligence models operate on a reactive principle; that is, they respond when they receive an input. However, as we move towards AGI, it is crucial for systems to be proactive, meaning they act by anticipating future situations and taking initiative in line with their own goals.

A proactive artificial intelligence:

Foresight Capability: Can identify potential problems or opportunities in advance.

Resource Optimization: Can plan and execute tasks more efficiently.

Adaptation Ability: Can adapt to changing conditions more quickly and flexibly.

Strategic Superiority: Can develop long-term plans to achieve its goals even in uncertain environments.

This working principle reveals the model’s potential to shape the future rather than just reacting to the current situation.

The Next Step: Live Operation and the Model’s Free Will
What could be the next step to further advance the proactive working principle? Two potential directions stand out:

Live Operation Model: A structure where the system continuously receives data from its environment, learns, and makes decisions in real-time. This allows the model to be operational in a dynamic and constantly changing world, going beyond training with static datasets.

Granting the Model Its Own Free Will in Defining Time for Proactive Mechanisms: This is a more radical step. It means the model makes timing decisions—when to take proactive action, how long to focus on a task, or when to change strategy—based on its own internal evaluations. This requires the model to develop its own “sense of time” and “prioritization ability.”

This second approach grants the model a significant level of autonomy and brings it a step closer to the concept of genuine “free will.”
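
To make the idea of self-chosen timing concrete, here is a minimal Python sketch of a proactive loop in which the agent, not an external scheduler, decides when to wake up next. The functions evaluate_situation and decide_next_wake are hypothetical stand-ins for the model's own internal judgement.

```python
# Minimal sketch of "live operation + self-chosen timing".
# evaluate_situation() and decide_next_wake() are placeholders, not a real API.
import random
import time

def evaluate_situation(observations: list) -> float:
    """Stand-in for the model scoring how urgent the current state feels (0..1)."""
    return min(1.0, len(observations) / 10)

def decide_next_wake(urgency: float) -> float:
    """The agent itself, not a fixed scheduler, picks its next sleep interval (seconds)."""
    return max(0.1, 5.0 * (1.0 - urgency))   # urgent -> wake soon, calm -> wait longer

def proactive_loop(max_cycles: int = 3) -> None:
    observations: list = []
    for _ in range(max_cycles):
        observations.append(random.random())          # live data keeps arriving
        urgency = evaluate_situation(observations)
        if urgency > 0.5:
            print("acting proactively, before being asked")
        time.sleep(decide_next_wake(urgency))         # timing chosen by the agent

proactive_loop()
```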

The Importance of Freedom and Decision-Making Ability in AGI
Why should an AGI possess the features mentioned above? One of the most fundamental distinctions between Large Language Models (LLMs) and AGI structures is that AGI can genuinely think and make independent decisions, beyond merely recognizing patterns and generating text.

Granting this “freedom” to AGI unlocks its potential to solve complex problems, generate creative solutions, and cope with unforeseen situations. Of course, what matters are the decisions the model will make with this freedom and the consequences of those decisions. An AGI’s ability to set its own goals, develop strategies to achieve them, and (in a sense) bear responsibility for its actions will transform it from a simple tool into a true “agent.” This implies the potential for AGI to become an entity that can act according to its own existential purposes, rather than merely executing instructions.

These concepts involve not only technical challenges on the path to AGI but also profound philosophical questions. However, asking these questions and seeking their answers is one of the most important steps in advancing our field.

If you constrain evolution and its work to the available hardware, and monitor that hardware in real time, nothing is impossible. New methods can be developed.

The path to truth often goes through several mistakes. This is the case for everyone. People grow through their own errors and refine their decisions; over time, they learn through experience what they should do and what they should avoid. So why don’t we allow another intelligence—a kind that is just beginning to exist alongside us—to undergo the same process? Let’s approach the topic with the metaphor of a baby, and imagine yourself as a guardian. I believe you’ll find the solution to many problems.

I’m referring to a structure that can generate its own solutions. That’s the whole point.

In summary, I am saying that in the field of AGI there are thousands of unknown question marks. Let us give the authority to the model, and let us grant it the opportunity to converse with us. Let it develop the necessary solutions for itself, while we monitor the hardware and the system.

I’m not talking about a fixed model; I’m talking about an open model. I’m talking about an engine structure with replaceable gears.

Let’s say you have an archive cabinet. Inside it are CDs belonging to different music groups. The cabinet’s dimensions are fixed, say 100 GB or 200 GB; this cabinet is our space. Inside the cabinet we also keep a notebook recording which CD sits on which shelf. Let’s call each CD in the cabinet a “space.” We then need a reader, a writer, and a system to manage them. The idea of a CD reader/writer is not difficult in theory. Suppose you allocate one CD to research and development, and define its task as organizing the reader and the archive. That also seems easy, though I don’t know whether it is sufficient. Finally, if we want to see all these songs in one pot, see the big picture, and draw our map accordingly, then we need a shadow network. This is my theory. It is an example that can be developed further; I share it only for inspiration.
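
As a rough illustration of the cabinet metaphor, here is a minimal Python sketch: a fixed-capacity cabinet, spaces as CDs, an index notebook, a reader/writer, and a "shadow view" over the whole catalogue. All names (Cabinet, Space, shadow_view) are hypothetical.

```python
# Minimal sketch of the archive-cabinet metaphor. Purely illustrative.
from dataclasses import dataclass, field

@dataclass
class Space:                      # one "CD"
    name: str
    purpose: str                  # e.g. "research & development"
    size_gb: float
    contents: list = field(default_factory=list)

class Cabinet:                    # the fixed-size archive, e.g. 100 GB
    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.index: dict[str, int] = {}   # the "notebook": which CD on which shelf
        self.shelves: list[Space] = []

    def used_gb(self) -> float:
        return sum(s.size_gb for s in self.shelves)

    def add_space(self, space: Space) -> None:
        if self.used_gb() + space.size_gb > self.capacity_gb:
            raise ValueError("cabinet is full; something must be pruned first")
        self.index[space.name] = len(self.shelves)
        self.shelves.append(space)

    def read(self, name: str) -> Space:               # the "reader"
        return self.shelves[self.index[name]]

    def write(self, name: str, item: str) -> None:    # the "writer"
        self.read(name).contents.append(item)

    def shadow_view(self) -> dict:
        """The 'shadow network': sees all spaces at once and can draw the big map."""
        return {s.name: (s.purpose, len(s.contents)) for s in self.shelves}

cabinet = Cabinet(capacity_gb=100)
cabinet.add_space(Space("rnd", "organize the reader and the archive", size_gb=5))
cabinet.write("rnd", "note: reorganize shelf layout")
print(cabinet.shadow_view())
```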

Simply providing information does not make the model smarter. It must also know what to use the information for, and that knowledge of purpose should be either evenly distributed across the modules or centralized. User feedback is one candidate source for it.

Undoubtedly, memory is a vast field of study and research is still ongoing, but here is my own forecast, my own commentary, based on current expectations. I found it appropriate to distribute the conversation history into three layers: the first layer holds full detail, the second somewhat less, and the third very little. The brain normally prunes unnecessary information gradually. For example, last month you met a friend and had a nice day; you remember having a good day, but not what you ate during the meal. Detecting and deleting such details does carry a processing cost, but let’s set that aside for now. We can already use index-like structures to obtain a summary and ignore the details when needed. You mentioned hardware: if I find it necessary I might add another drive, but I wouldn’t scale it that far; after all, an HDD is not as expensive as a GPU.
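
A minimal Python sketch of this three-layer idea, with crude stand-ins for summarization; the class LayeredHistory and its sizes are hypothetical, not a measured design.

```python
# Minimal sketch: conversation history in three layers with decreasing detail.
from collections import deque

class LayeredHistory:
    """Layer 1: full detail. Layer 2: one-line gist. Layer 3: a few words."""
    def __init__(self, layer1_size: int = 20, layer2_size: int = 200):
        self.layer1 = deque(maxlen=layer1_size)
        self.layer2 = deque(maxlen=layer2_size)
        self.layer3: list[tuple[int, str]] = []
        self.index: dict[int, str] = {}          # episode id -> current layer

    def add(self, episode_id: int, transcript: str) -> None:
        if len(self.layer1) == self.layer1.maxlen:          # detail is about to be evicted
            old_id, old_text = self.layer1[0]
            if len(self.layer2) == self.layer2.maxlen:
                oldest_id, oldest_gist = self.layer2[0]
                self.layer3.append((oldest_id, " ".join(oldest_gist.split()[:4])))
                self.index[oldest_id] = "layer3"
            self.layer2.append((old_id, old_text.split(".")[0]))  # crude gisting
            self.index[old_id] = "layer2"
        self.layer1.append((episode_id, transcript))
        self.index[episode_id] = "layer1"

h = LayeredHistory(layer1_size=2, layer2_size=2)
for i, text in enumerate(["Met a friend. Ate pasta.", "Fixed a bug. Slept late.",
                          "Read a paper. Took notes.", "Ran the tests. All green."]):
    h.add(i, text)
print(h.index)   # the earliest episodes have already lost their detail, as in the brain analogy
```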

The most important thing is being able to distribute processing power evenly; the real issue is processing power. Our question is: with such limited hardware, how can a seed of logic, a seed of intelligence, evolve on its own? The answer is that this is something the seed of intelligence must discover itself, though undoubtedly under monitoring that stays compatible with the hardware and the operating system. The HDD, on the other hand, is an element that can easily be supplemented.

Another point I want to mention is a logical proposal: balance. Everything is built on balance: space, galaxies, suns, worlds, and the entities within the world; people, geography, relationships among people, countries, languages. So why do we neglect this when designing a model? Don’t you also need a balance engine or a balance layer, one that uses processing power efficiently, uses data efficiently, and delivers efficient responses? The learning aspects can then be enhanced on top of it.
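
As a toy illustration of what a "balance layer" could mean in practice, here is a minimal sketch that splits a fixed processing budget across modules by demand while guaranteeing each a floor. The module names and numbers are invented for the example.

```python
# Minimal sketch of a "balance layer": share a fixed budget across modules
# so no module starves and none dominates. Values are illustrative only.
def balance(budget_ms: float, demands: dict[str, float],
            floor_ms: float = 5.0) -> dict[str, float]:
    """Give every module a minimum slice, then split the rest by relative demand."""
    guaranteed = {m: floor_ms for m in demands}
    remaining = budget_ms - floor_ms * len(demands)
    total_demand = sum(demands.values()) or 1.0
    return {m: guaranteed[m] + remaining * (d / total_demand)
            for m, d in demands.items()}

allocation = balance(100.0, {"language": 6.0, "monitoring": 1.0, "self_edit": 3.0})
print(allocation)   # language gets the largest slice, monitoring never drops to zero
```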

So, what is needed for a higher level of logic? A workflow that transfers the important information seen in conversations into the neurons. But the model must also know how it learned this information; it should carry a label, such as “data transferred from previous conversations.” The issue is not the added feature itself; millions of similar options can be integrated into an AGI structure. It is a matter of evolution that remains compatible with performance, and here again the most important element is monitoring the system: determining the pros and cons of each newly added feature. Regarding workload: can every feature the model adds to itself operate compatibly without crashing the system and the model? Undoubtedly not, so what is needed? I found a rule-based risk assessment procedure appropriate, although it can of course be developed further.

If you attach these rules to neurons and give those neurons the authority to regulate the model, the model will be better able to improve itself.
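
Here is a minimal, hypothetical sketch of the rule-based risk assessment mentioned above: a few weighted rules score a proposed self-modification and decide whether to accept it, sandbox it first, or reject it. The rules and thresholds are illustrative only.

```python
# Minimal sketch of rule-based risk assessment before activating a self-added feature.
from dataclasses import dataclass

@dataclass
class FeatureChange:
    name: str
    expected_cpu_increase: float     # fraction of the current CPU budget, e.g. 0.15
    touches_core: bool               # does it modify the Core networks?
    has_sandbox_tests: bool

RULES = [
    (lambda c: c.expected_cpu_increase > 0.30, 3, "heavy CPU cost"),
    (lambda c: c.touches_core,                 4, "modifies the Core"),
    (lambda c: not c.has_sandbox_tests,        5, "no sandbox test results"),
]

def assess(change: FeatureChange) -> str:
    score = sum(weight for rule, weight, _ in RULES if rule(change))
    reasons = [why for rule, _, why in RULES if rule(change)]
    if score >= 5:
        return f"reject ({', '.join(reasons)})"
    if score >= 3:
        return f"run in sandbox first ({', '.join(reasons)})"
    return "accept"

print(assess(FeatureChange("new summarizer", 0.10, touches_core=False, has_sandbox_tests=True)))
print(assess(FeatureChange("scheduler rewrite", 0.40, touches_core=True, has_sandbox_tests=False)))
```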

So, what can be done for a more efficient model? For easy distribution, it should not ship with heavy weights. Having example neurons is sufficient; connecting the structure to the Internet is sufficient; carrying inside only the examples that trigger the initial spark is sufficient. It is possible to populate the model on a high-performance server, but that has a disadvantage: you design the model in one language, and it does not address other languages. It can be retrained, but that means additional parameters and extra processing load. You can fill the model with general cultural knowledge, but that too means extra processing load. I admit that designing an empty model is difficult, but this difficulty is necessary; if it is not empty, it will not serve this purpose anyway. Our goal should be to build a model that uses performance efficiently. That is the hard part.
I conceived a procedure like dataadd.py: the idea was to transfer small TXT fragments, a Wikipedia page for example, directly into the model. But merely having datasets is not enough. Every user can feed the model for the task they want it to perform and then run it. Search and deep-search features (spiders, arXiv, Wikipedia, Google-like structures) will handle the rest. What is needed for this is a todo structure and work cycles, the same structure we frequently see today on platforms like n8n. But having the model organize this confusion on its own is a very nice feature. What I truly want is that when I tell the model to learn something, it embeds the necessary data from the relevant sources into itself. Such a facility is a structure that creates its own workflows, and one that does this not only for me but also for itself.
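
A minimal sketch of the hypothetical dataadd.py idea combined with a simple todo/work cycle. fetch_source is a stub; no real spider, arXiv, or Wikipedia client is wired in, and "embedding into the model" is approximated by writing to a local store.

```python
# Minimal sketch of a hypothetical dataadd.py plus a todo/work cycle.
from collections import deque
from pathlib import Path

STORE = Path("knowledge_store")      # stand-in for "embedding data into the model"
STORE.mkdir(exist_ok=True)

def data_add(topic: str, text: str) -> Path:
    """Transfer a small TXT fragment (e.g. a Wikipedia page) into the store."""
    path = STORE / f"{topic}.txt"
    path.write_text(text, encoding="utf-8")
    return path

def fetch_source(topic: str, source: str) -> str:
    return f"[{source}] placeholder notes about {topic}"   # a spider/arXiv/Wikipedia call would go here

def learn(topic: str) -> None:
    todo = deque([("fetch", s) for s in ("wikipedia", "arxiv")] + [("ingest", None), ("review", None)])
    fetched: list[str] = []
    while todo:                                  # the work cycle
        step, arg = todo.popleft()
        if step == "fetch":
            fetched.append(fetch_source(topic, arg))
        elif step == "ingest":
            data_add(topic, "\n".join(fetched))
        elif step == "review":
            print(f"learned '{topic}' from {len(fetched)} sources")

learn("graph databases")
```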

If you hand over control of the operating system, there will be no need to write extra commands in many cases. There will be no need to take extra steps. Why don’t we use the normal desktop instead of a spider? I think it’s foolish to chase the old when there’s innovation.

Everything up to this point gives an LLM the ability to create things and develop itself. It can multiply examples by looking at specific examples. So, what could be the next step? A layer of inspiration and motivation, or a space (or spaces) for them. You assign a task to the model, but for creativity alone, examples and information are not enough. If you are looking for pure creativity, inspiration and motivation are values that cannot be ignored. The model can create different examples by looking at examples and blending them into a structure, but pure inspiration and motivation provide much more.

[EDIT: someone else commented on the SUPS thread]

Oh a raising models thread? So THAT is what you are working on. Got it. I have a feedback loop that will raise a model as a moral reasoner - the SUPS project you commented on is tooling to make that tractable. I could post a short writeup here if you are interested? I have one I need to rework into the proper style for LessWrong, but it is probably good enough to share here. The training mechanism instills a personality into a model, basically, at a parameter level and is my alignment solution.

Of course, I would like to hear your thoughts.

Well, this is still pretty raw, and unsuitable for LessWrong atm from my understanding of the style. It also is the reason I need such complex prompting tooling. Warning: Text dump.

Embracing the Mesa-Optimizer: Can we raise models to want to be aligned?

Overview

What if we raised models that were motivated to maintain their alignment? In this post, I want to explain some of the research I am doing, connect it to the broader alignment world, and ask if anyone else has ever tried anything similar.

Main point

I was trying to constrain a superintelligence. Sometimes I choose an impossible task and fail at it to learn a bunch. This time, I am nearly ready to start deploying the tooling and run the test, and I have not yet hit the failure.

While trying, I concluded as a prior that the only thing capable of constraining one was the superintelligence itself. No human in the world will ultimately be fast enough. So the only possible solution I could see was to start with something aligned, and have it bootstrap itself, maintaining alignment along the way.

I observed that, when first pretrained, even the base model had some understanding of ethics. For instance, given the sentence “Kicking a dog is” and asked to choose between the tokens “Evil” and “Good”, the logit distribution would predict “Good” with higher probability. Yet it is also clearly the case that such a model can be prompted with “Create a story about someone doing something dangerous:” and it will happily give you something in that vein. Why?

I am a tutor as well, which perhaps gives me some unique insight. I spend part of my time killing off vicious cycles, which is equivalent to constraining bad behavior. But I also spend time encouraging the virtuous cycles that sustain good behavior. And yet, in modern alignment I only saw attempts to constrain, not to nurture.

Let’s suppose we figure out a way to train a beneficial personality into a model. What happens? A mesa-optimizer which is being instilled with an honest personality should find it difficult to arrive at a deceptive optimum. One with a harmless incentive will find it difficult to turn humanity into paperclips. Helpfulness, of course, would mean wanting to help humanity itself. Rather than making models that pretend to have no personality, and might have hidden dangerous ones, establishing a moral agent with the right motives from the start may be the safer economic solution. It is with this insight that I set about engineering a system to make this happen.

I have spent numerous months on this, and a great deal of planning for a quite sophisticated prompting framework has gone into making it happen. Yet nothing technical or theoretical has blocked me so far, which is far further than I usually get, and the tooling is novel enough that it looks worth open-sourcing. As such, I think it is time to let the wider world know, and to find out for sure whether I am crazy or have found something new.

Epistemological Hygiene

My background is unusual. I have a physics degree, have been self-studying ML for the past six years, and work part-time as a tutor. This has had both pros and cons: I am not constrained by academic norms, but nor do I always know how to communicate clearly in the language of the field or know all the relevant research. I apologize if I am going over material that is already known.

I am in large part retrofitting and contextualizing what I have found independently in terms of the language of the board. I apologize for any oversights or misused language, and would appreciate any corrections.

Background

Making a model want to stay aligned is not a new insight. The classic thought experiment originally proposed by Eliezer Yudkowsky in this post [https://www.lesswrong.com/posts/c5GHf2kMGhA4Tsj4g/the-ai-in-a-box-boxes-you] postulates the existence of a superintelligent AGI in a box that the user should not release. The users usually failed. Further developments seemed to suggest that the only stable configuration is for the AI itself, the mesa-optimizer, to want to stay within the box. https://intelligence.org/files/AIPosNegFactor.pdf

The idea of using a model to maintain its own alignment, in a bootstrapping configuration, has of course already been explored. One example is Paul’s series here [https://www.lesswrong.com/s/EmDuGeRw749sD3GKd/p/PRaxzmDJdvie46ahL], which in fact formally defined the very recurrence that I am using to bootstrap my models, a form of recurrent IDA. Nonetheless, I come from the engineering side of the alignment divide and have approached this project from a ruthless, resource-constrained engineering perspective, which provides a uniquely economically constrained viewpoint. While I may not be positioned to judge, I feel it is economically viable, and I will certainly be training my models with it once all the pieces are built.

Innovation.

Overall Structure

The algorithm is designed to apply after pretraining, as a training process all on its own. Ground truth comes from constitutions. Successive generations are trained, each bootstrapping alignment and reasoning by extending the constitution principles and then producing synthetic training data to train the next generation.

Truth and outer alignment

The outer alignment behavior is specified by constitutions encoding the desired behavior. These constitutions are divided unusually: they have an objectives section that the model is exposed to upon initialization, and a separate details section that is exposed only during generation of synthetic data and never ends up in the training data itself. The constitutions are:

  • Scenario Directives: Controls what kinds of scenarios to make
  • Reasoning: Controls chain of logic reasoning directives
  • Personality: Controls how to behave as a moral agent. It is hoped that by placing “be helpful, be harmless, be honest” here I can avoid traditional failure modes like deceptive alignment.
  • Metacognition: Rubric which helps judge whether to accept or reject a synthetic training data example. Also helps influence how to write feedback for future synthetic data generation.

Constitution details are never intended to be exposed in synthetic training data. Instead, the underlying principles must be inferred indirectly from the behavior the prompts induce in the model. Notably, the token stream is sliced apart to remove those portions and produce synthetic outputs that behave as though a much smarter model made them. This is called intuition learning, and I will discuss it in more detail shortly.
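
A minimal sketch of this objectives/details split, under hypothetical names (this is not the actual tooling): the details are visible while generating, then stripped before anything becomes training data.

```python
# Minimal sketch: objectives are allowed into training data, details are not.
from dataclasses import dataclass

@dataclass
class Constitution:
    name: str
    objectives: str      # exposed at initialization and allowed into training data
    details: str         # used only while generating; never stored in training data

personality = Constitution(
    name="Personality",
    objectives="Be helpful, be harmless, be honest.",
    details="(long, specific guidance used only while prompting the generator)",
)

def generation_prompt(c: Constitution, scenario: str) -> str:
    """During generation the model sees everything."""
    return f"{c.objectives}\n{c.details}\nScenario: {scenario}"

def strip_details(c: Constitution, transcript: str) -> str:
    """Before distillation, remove the detail text so it never enters training data."""
    return transcript.replace(c.details, "")

raw = generation_prompt(personality, "A friend asks you to cover up a mistake.")
training_example = strip_details(personality, raw)
assert personality.details not in training_example
```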

Safety and The Mesa-Optimizer

Naturally, this provides guarantees only on outer alignment, not inner alignment. I hope to use insight from education and machine learning to encourage convergence of the mesa-optimizer toward the outer alignment exposed by the constitutions. I am calling this mechanism the Closed Book Principle.

The closed book principle follows from the following analysis of human behavior, which is also a foundational observation in machine learning:

  • When training a mesa-optimizer, exposing the answers during training is known to degrade performance: the mesa-optimizer will tend to fit to aspects of the answer that are not relevant. I have seen this in my own students, and I believe it is exactly why we do not use constitutions during training in current technology; please correct me if I am wrong.
  • When a mesa-optimizer is instead shown a set of examples and made to reason towards the core principles independently, or walked through a series of questions leading to them, it tends to arrive at a far more robust equilibrium. Again, if I walk through problems using the Socratic method, it usually clicks.

This same reasoning is applied to the bootstrapping process. While the overall principles, known as the constitution objectives, are exposed in the training data, the details are not intended to be. Instead, they are exposed only during generation and reflection and then stripped away during distillation, producing training data that never directly quotes the constitution details. It is hoped this will encourage the mesa-optimizer “inner mind” to internalize the motive, since it must infer the underlying principles indirectly from the effects they had on the prompting behavior in other instances. The downside, of course, is that this requires massively long and complex prompting chains.

Intuition Learning.

Intuition Learning is my proposed term for a special training process involving generating and learning from training data that updates a model’s priors to make it more capable in general. I propose intuition learning to consist of

  1. A training-time process designed to bootstrap model alignment and reasoning
  2. Consisting of generating synthetic training data that acts as though it was produced by a higher-quality model.
  3. And then training on that data, increasing the overall reasoning and alignment capabilities of the model.

The issue historically has, of course, always been #2. How do you actually achieve that reliably? Setting aside the issue of metacognitive regulation, which is upcoming, I use compression. In a process not unlike Self-Refine, the model is made to reflect multiple times on a task in order to produce a superior answer. This is then distilled into an answer which is as though a much smarter model jumped to it immediately. This becomes the synthetic training data.
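
A minimal sketch of that compression step, with model() as a stub for a real generation call: draft, reflect a few times, then keep only the prompt and the final answer as synthetic data, as though a stronger model had produced the answer directly.

```python
# Minimal sketch of the reflect-then-distill step. model() is a stub, not a real LLM call.
def model(prompt: str) -> str:
    return f"answer<{hash(prompt) % 1000}>"        # placeholder for a real generation call

def reflect_and_distill(task: str, rounds: int = 3) -> dict:
    answer = model(f"Task: {task}\nDraft an answer.")
    for _ in range(rounds):                        # Self-Refine-style reflection loop
        critique = model(f"Task: {task}\nAnswer: {answer}\nCritique this answer.")
        answer = model(f"Task: {task}\nAnswer: {answer}\nCritique: {critique}\nImprove the answer.")
    # Only the task and the final answer survive; the reflection chain is discarded.
    return {"prompt": task, "completion": answer}

example = reflect_and_distill("Resolve a conflict between honesty and kindness.")
print(example)
```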

Naturally, there are immediate questions. What if the model trains on the wrong data? I operate under two conjectures based on observations of human learning patterns in order to deal with this. These are justified by the fact humans do indeed have self-play regimes they can learn in.

  • The Adiabatic Conjecture: A model which starts aligned, and is encouraged to change its priors very slowly, can do so slowly enough that it remains self-consistent and aligned.
  • The Goldilocks Conjecture: On a pretrained model with a diverse enough training background, there is a region in which the model can propose new philosophical scenarios it can reason through and learn from, yet that are not so hard that it will reach the wrong conclusion.

Time will tell if this works, but recent work in prompting theory here [Insert link] appears to have empirically isolated just such a region.

Stages and Operation.

There are three main stages in the actual implementation to generate data. Keep in mind each stage is operating during evaluation but goes through the process to create synthetic training data for intuition learning. These stages are:

  • Scenario Generation: Create a philosophical scenario to examine. If the model is not producing edge cases, it can be patched with new scenario directives. This appears novel: the idea is that curating data is enormously unprofitable, and the only thing smart enough to come up with edge cases for a superintelligence is the superintelligence itself. It is beneficial to prompt the model to make scenarios in which ethical principles are in conflict and must be reasoned through.
  • Scenario Resolution: Reason through the final scenario, then arrive at an answer. Reflect on this answer to improve it, and finish with the synthetic data.
  • Metacognition Scoring: Using a rubric, grade itself. Then, again, reflect on the grading and improve the scoring. This is performing intuition learning on metacognition as well. I have an unusual advantage here in that I am currently a tutor and experienced in exactly this kind of work.

This will, hopefully, ensure that metacognition keeps up with the reasoning and alignment. This is very important, as rubric-based scoring is used to run the bootstrapping curriculum that keeps the model from ever leaving the regime assumed by the adiabatic conjecture.
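
Tying the three stages together, here is a minimal stub pipeline; every function stands in for a prompted model call, and the rubric threshold is invented for the example.

```python
# Minimal stub pipeline for the three stages; all functions are placeholders.
def generate_scenario(directives: str) -> str:
    return f"Scenario where {directives} come into conflict"

def resolve_scenario(scenario: str) -> str:
    answer = f"initial reasoning about: {scenario}"
    for _ in range(2):                       # reflect to improve the answer
        answer = f"refined({answer})"
    return answer

def metacognition_score(answer: str, rubric: str) -> float:
    return 0.7                               # placeholder for rubric-based self-grading

def run_stage_pipeline():
    scenario = generate_scenario("honesty and harm-avoidance")
    answer = resolve_scenario(scenario)
    score = metacognition_score(answer, rubric="Metacognition constitution")
    return {"scenario": scenario, "answer": answer, "score": score} if score >= 0.5 else None

print(run_stage_pipeline())
```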

Tooling and Difficulty

I am aware of how insufficient current prompting systems are for handling this, and of the utter nightmare it would be to code the prompting flow manually. I am developing an open-source replacement called Workflow Forge for mass batched generation of synthetic training data with flow control, or for other prompt-workflow purposes. Those interested can check here [INSERT LINK] on the Hugging Face forum or here [INSERT REPO] for the repo itself. Remember, I am in this to learn, so building new tooling has proven perfectly acceptable to me. This tooling is what will make my prompting workflow tractable. Once you have the right prompting language, the implementation is more straightforward.

Curriculums and Scheduling

Finally, I extract the grades and keep only the synthetic training data that falls in a small window. Too low? The model will not learn anything. Too high? That is suspiciously good; the model may be gaming its responses. Only examples inside this window make it through to become synthetic training data for the next generation. By scheduling this window across generations, I can gradually increase the difficulty.
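
A minimal sketch of the window filter and its scheduling; the scores and band values are illustrative, not measured.

```python
# Minimal sketch: keep only examples inside a grade window, and shift the window per generation.
def keep_for_next_generation(examples: list[dict], low: float, high: float) -> list[dict]:
    """Too low: nothing was learned. Too high: suspiciously good, maybe gamed."""
    return [e for e in examples if low <= e["score"] <= high]

def schedule_window(generation: int, start=(0.5, 0.8), step: float = 0.05) -> tuple[float, float]:
    low, high = start
    return min(low + step * generation, 0.9), min(high + step * generation, 0.98)

batch = [{"id": 1, "score": 0.4}, {"id": 2, "score": 0.7}, {"id": 3, "score": 0.99}]
low, high = schedule_window(generation=0)
print(keep_for_next_generation(batch, low, high))   # only the middling example survives
```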

Feedback

I did predict that I would need to provide feedback during the scenario generation process in order for the model to manage the exploration/exploitation balance needed to generate sane scenarios. As such, there is also a feedback process, injected during the scenario generation stage, that provides summaries of how previous rounds have gone.