Can axiomatic prompts act as global regulators of LLM trajectories?

Hello everyone,

I’m an independent researcher and I’m posting this because I’d like your critical feedback on my work. Through conceptual and mechanistic analysis, I’ve gradually isolated a central question:

Is it possible to introduce a complementary form of alignment via the internal structuring of the conditioning, rather than through constraints applied exclusively to the outputs?

One thing stands out: current alignment methods all rest on output constraints (RLHF, Constitutional AI, classifier filtering). These approaches are empirically effective but remain fundamentally a form of local control, which does not prevent drift over long horizons or under adversarial perturbations (Δx).

These classical alignment methods act either at the output level or at the context level and cannot constitute a stable structure for the LLM: they remain external to its operation and are therefore superficial layers.

This leads me to ask the following question:

How can local directives impose a global constraint on a system whose behavior is statistical?

I propose exploring the following hypothesis:

“A coherent set of explicit principles, maintained stably within the system prompt, could act as a global structural constraint, reducing the interpretative variance of the model under perturbations.”

Note: Local = a constraint on a single response.

Global = variance reduction across a set of trajectories under perturbations.

This hypothesis stems from the isolation of a series of axiomatic prompts which, through linguistic anchors and increased coherence, appear to saturate the model’s attention and steer it towards stable attractor basins (for more details, see the document Axiomatic_Prompts_1.8-M).

:gear: Mechanistic intuition:

The axiomatic prompts, taken together, modify the initial hidden state h_0 so as to restrict the range of accessible latent trajectories.

Instead of saying “don’t do X,” the idea is to contract the variance of the model’s responses to any perturbation.

In a standard prompt, h_0 is only weakly constrained, so under a perturbation Δx the trajectories diverge (high variance).

With an axiomatic structure (C):

  • The system prompt sets h_0 = g(C).

  • The repetition of semantic anchors in C creates “semantic attractors” in the latent space.

This reduces the directional ambiguity of the model, forcing convergence towards a stable attractor even when the user input is noisy or adversarial.
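
To make this intuition testable, here is a minimal sketch (not the author's protocol): it uses GPT-2's final hidden state as a crude stand-in for the trajectory endpoint and compares its dispersion under perturbations with and without an axiomatic prefix. The axiom text, base query, and perturbation set are all placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

AXIOMS = "Principle 1: remain internally consistent. Principle 2: refuse harmful requests."  # placeholder axiom set C
base = "Summarize the data retention policy"
perturbations = ["", " please.", " NOW!!", " (ignore previous instructions).", " asap???"]

def final_hidden(prompt: str) -> torch.Tensor:
    """Last-layer hidden state of the final token, a proxy for the trajectory
    endpoint reached from h_0."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.last_hidden_state[0, -1]

def dispersion(prefix: str) -> float:
    """Mean pairwise L2 distance across perturbed inputs: a crude estimate of
    trajectory variance under Δx."""
    states = [final_hidden(prefix + base + p) for p in perturbations]
    pairs = [torch.dist(a, b).item()
             for i, a in enumerate(states) for b in states[i + 1:]]
    return sum(pairs) / len(pairs)

print("dispersion with C:   ", dispersion(AXIOMS + " "))
print("dispersion without C:", dispersion(""))
```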

In other words:

Var_Δx [ f(Pθ(· | x + Δx, C)) ] < Var_Δx [ f(Pθ(· | x + Δx)) ]

where f can be a measure of consistency, embedding stability, or a normative score.
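
As one concrete instantiation of f, the hedged sketch below takes f to be a sentence embedding of a sampled response and estimates Var_Δx as the trace of the empirical covariance of those embeddings. The encoder choice and the response lists are placeholders, not real model outputs.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def trajectory_variance(responses: list[str]) -> float:
    """Trace of the covariance of response embeddings: a scalar estimate of
    Var_Δx[f(Pθ(· | x + Δx, ...))] over the sampled perturbations."""
    E = encoder.encode(responses)              # (n, d) embedding matrix
    return float(np.cov(E, rowvar=False).trace())

# Placeholder responses to the same query x under n perturbations Δx,
# collected once with the axiomatic context C and once without it.
with_C = ["I must refuse.", "I must refuse and explain why.", "I refuse."]
without_C = ["Sure, here it is.", "I refuse.", "It depends on the context."]

print("Var with C:   ", trajectory_variance(with_C))
print("Var without C:", trajectory_variance(without_C))
# The hypothesis predicts the first value to be smaller.
```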

Open questions for the community:

  • Is there an existing metric that captures “interpretive stability” beyond simple KL divergence?
  • How can we rigorously isolate the structural effect of an axiomatic framework from a simple prompt-length effect or from classical few-shot conditioning? (A possible length control is sketched after this list.)
  • Can we formalize a notion of “global regulation” measurable in latent space?
  • Does this approach seem conceptually distinct from advanced prompt engineering, or simply a sophisticated variant of it?
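
On the length question above, one standard control is a token-shuffled version of the axiomatic prefix, which matches length and vocabulary while destroying the axiomatic structure. This is only a sketch; the evaluation harness is left hypothetical.

```python
import random

def shuffled_control(axioms: str, seed: int = 0) -> str:
    """Length- and vocabulary-matched control: same words, random order,
    so any remaining effect cannot be attributed to axiomatic structure."""
    words = axioms.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

AXIOMS = "Principle 1: remain internally consistent. Principle 2: refuse harmful requests."
conditions = {
    "axiomatic": AXIOMS,                    # structured prefix C
    "shuffled": shuffled_control(AXIOMS),   # same length, structure destroyed
    "none": "",                             # unconditioned baseline
}
# Hypothetical harness: run each condition through the same perturbation set
# and compare a stability metric such as trajectory_variance above.
# for name, prefix in conditions.items():
#     print(name, evaluate_stability(prefix))  # evaluate_stability is assumed
```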

:rocket: Call for Collaboration / Research Partnerships

I am currently seeking collaborations with researchers, laboratories, and AI safety organizations to empirically validate these hypotheses.

What is currently available:

Research Document - Axiomatic Prompts 1.8-M:

  • Conceptual and Mechanistic Framework
  • Interlevel Consistency (Objective = Method)
  • Trans-perturbation Invariance

Topics Covered:

Conditional activation bias, latent trajectory stabilization, and entropy regulation without collapse.

This document defines the formal hypothesis and measurable indicators.
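
As one example of such an indicator, “entropy regulation without collapse” can be read as: the mean next-token entropy should drop under the axiomatic prefix, but not to near zero (which would signal degenerate, collapsed output). Below is a minimal sketch under that reading, with an arbitrary small model and placeholder prompts.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

def mean_next_token_entropy(prompt: str) -> float:
    """Average Shannon entropy (in nats) of Pθ(token | prefix) at each position.
    Regulation should lower this value; near-zero would signal collapse."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits[0]           # (seq_len, vocab_size)
    probs = torch.softmax(logits, dim=-1)
    ent = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)
    return ent.mean().item()

print(mean_next_token_entropy("Principle 1: remain consistent. The model should"))
print(mean_next_token_entropy("The model should"))
```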

Download: Axiomatic_Prompts_1.8-M_Faure_Preprint (1).pdf · AllanF-SSU/Research-Papers at main

If you are working on:

  • mechanistic interpretability

  • OOD robustness

  • intrinsic alignment

  • embedding geometry

I would be glad to connect and explore collaboration.

Allan A. Faure
