POC - Model inspired by fluid dynamics

This is an exploration of a novel architecture inspired by fluid physics, whose behavior can be interpreted through the underlying equations.

The Advection-GLU Architecture

The model’s FFN is a GLU-type bilinear branch. First, RoPE with a slow base of 16180 is applied to z. Next, the element-wise product W_u(\cdot) \odot W_g(\cdot) is computed and turned into an advection term scaled by the viscosity \nu. Finally, the result is projected by C (preceded by LN in legacy mode).

Formula (legacy mode, pre_ln=False):

z_{glu} = \text{RoPE}_{16180}(z)
\text{bilinear} = W_u(z_{glu}) \odot W_g(z_{glu})
\text{adv} = -(\alpha_{adv}\nu) \cdot \text{bilinear}
\text{FFN}(z) = C(\text{LN}(\text{adv}))

Advection is the mechanism that transforms the output of the bilinear FFN (GLU) into a “transport” term applied to the latent z, with an intensity controlled by a context-dependent viscosity \nu.

  • Step 1 (preparation): A slow RoPE is applied to z to obtain z_{glu}.
  • Step 2 (bilinear field): A field W_u(z_{glu}) \odot W_g(z_{glu}) is computed.
  • Step 3 (advection): This field is multiplied by -\alpha_{adv}\nu (the “−” sign gives the term advection-like dynamics).
  • Step 4 (residual injection): It is projected by C (and LN in legacy mode), then added to z via \alpha_{mlp}.
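
The four steps above can be sketched in NumPy as follows. This is only a minimal illustration of the legacy-mode formula, not the actual implementation. Several details are assumptions because the post does not specify them: the half-split RoPE layout, an LN without learned scale or bias, weights stored as plain matrices applied on the right, and \nu passed in as an input (how the context-dependent viscosity is computed is not described here). The function names rope, layer_norm and advection_glu are hypothetical.

```python
import numpy as np

def rope(z, base=16180.0):
    """Rotary position embedding on z of shape (seq, d), with d even.
    Assumption: half-split layout (first half paired with second half)."""
    seq, d = z.shape
    half = d // 2
    inv_freq = base ** (-np.arange(half) / half)           # (half,)
    angles = np.arange(seq)[:, None] * inv_freq[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    z1, z2 = z[:, :half], z[:, half:]
    return np.concatenate([z1 * cos - z2 * sin, z1 * sin + z2 * cos], axis=-1)

def layer_norm(x, eps=1e-5):
    """LN over the last axis. Assumption: no learned scale or bias."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def advection_glu(z, W_u, W_g, C, nu, alpha_adv=1.0, alpha_mlp=1.0):
    """Legacy mode (pre_ln=False). nu is supplied by the caller (assumption)."""
    z_glu = rope(z)                              # Step 1: slow RoPE
    bilinear = (z_glu @ W_u) * (z_glu @ W_g)     # Step 2: bilinear field
    adv = -(alpha_adv * nu) * bilinear           # Step 3: advection term
    ffn = layer_norm(adv) @ C                    # Step 4: LN, then projection by C
    return z + alpha_mlp * ffn                   # residual injection into z

# Toy usage with random weights (shapes are illustrative only).
rng = np.random.default_rng(0)
seq, d, hidden = 4, 8, 16
z = rng.normal(size=(seq, d))
W_u = rng.normal(size=(d, hidden))
W_g = rng.normal(size=(d, hidden))
C = rng.normal(size=(hidden, d))
nu = np.full((seq, 1), 0.1)   # one viscosity value per position (assumption)
z_next = advection_glu(z, W_u, W_g, C, nu)
```

Setting alpha_mlp to 0 returns z unchanged, which is a quick way to check that the branch only acts through the residual path.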

Training Observations

The main observation regarding the model’s behavior after training is its syntactic, orthographic, and grammatical precision, as well as its strict adherence to the overall structure of the language.

This is a working proof of concept (POC) that requires further adjustments and exploration. The model is highly stable and converges smoothly during training.

Project Context

I do not come from a computer science or machine learning background; I simply had an intuition about the role of fluid mechanics in artificial intelligence. Interestingly, certain parts of the Navier-Stokes equations revealed a specific capability for fast learning when it comes to spelling and syntactic structure. The model grasps the rhythm and the overall form of the language, making it seem almost as if it were natively designed for robotics.
