I’m not a developer or mathematician — I’m a systems administrator. I had an intuition about how meaning might organise itself through phase relationships and wave mechanics rather than vector distances. I collaborated with AI (Claude) to formalise and test those ideas, and the result aligned with published research by Listopad (2025) — Wave-Based Semantic Memory with Resonance-Based Retrieval (arXiv:2509.09691).
The core idea: semantic relationships encoded as harmonic waveforms on circular embeddings, with retrieval through constructive interference rather than cosine similarity.
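To make that concrete, here is a minimal toy sketch of resonance-based retrieval (illustrative only, not code from the repo — the phase assignments and scoring function are my own invention): items get phases on a circle, each phase drives a harmonic waveform, and a query is scored by how constructively its waveform interferes with each stored one.

```python
import numpy as np

# Toy sketch: items as unit-circle phases, query scored by constructive
# interference of harmonic waveforms. Names and phases are illustrative.
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)

def waveform(phase, harmonics=3):
    """Superpose harmonics cos(n*(t - phase)) for one item."""
    return sum(np.cos(n * (t - phase)) for n in range(1, harmonics + 1))

items = {"cat": 0.3, "dog": 0.5, "car": 2.8}   # toy phase assignments
query = waveform(0.45)                          # query phase near "dog"

def resonance(q, m):
    # Peak of the interference pattern (q + m), normalised; waveforms
    # with aligned phases interfere constructively and score near 1.
    return np.max(q + m) / (2 * np.max(np.abs(q)))

scores = {w: resonance(query, waveform(p)) for w, p in items.items()}
best = max(scores, key=scores.get)   # "dog" resonates most strongly
```

Aligned phases add peak-to-peak; misaligned phases partially cancel, so the score falls off with phase distance.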
Repo: https://github.com/atech-hub/Wave-Coherence-as-a-Computational-Primitive (harmonic coherence as a universal relationship detection operator)
Sharing it here in case anyone working in embedding spaces or retrieval finds it interesting or wants to take it further.
Hi, I’ve just uploaded a Python version of the tests. I’m sure this will make things easier for the community.
Regards
Update: we’ve added Test 21 which shows cosine similarity returning 0.0 where harmonic sweep finds 1.0 on controlled data. The open question is whether real model embeddings contain this harmonic structure. If anyone has run the sweep against actual embeddings, I’d genuinely like to know the result — positive or negative.
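For anyone who wants a feel for what Test 21 demonstrates, here is a minimal toy along the same lines (my own illustration, not the repo’s actual test): two signals carrying the same harmonic a quarter-cycle apart have zero cosine similarity, while a phase sweep recovers perfect alignment.

```python
import numpy as np

# Same frequency, 90 degrees apart: the dot product vanishes, yet a
# phase sweep finds a shift at which the two signals align exactly.
t = np.linspace(0, 2 * np.pi, 512, endpoint=False)
x = np.cos(t)
y = np.cos(t + np.pi / 2)        # same harmonic, quarter-cycle shift

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Sweep a trial phase shift and keep the best re-aligned similarity.
phis = np.linspace(0, 2 * np.pi, 360, endpoint=False)
sweep = max(cosine(np.cos(t + phi), y) for phi in phis)

print(round(cosine(x, y), 4))    # ~0.0 — blind to the relationship
print(round(sweep, 4))           # 1.0  — the sweep recovers it
```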
Update: We ran the test ourselves — results are positive.
We used all-MiniLM-L6-v2 (384 dimensions) and applied spectral coherence analysis to real embeddings across 44 words in 6 relationship groups.
Cosine similarity cannot distinguish antonyms from synonyms. Look at these scores:
┌────────────────────┬───────────┬────────┐
│ Pair │ Type │ Cosine │
├────────────────────┼───────────┼────────┤
│ big / large │ synonym │ 0.81 │
├────────────────────┼───────────┼────────┤
│ fast / slow │ antonym │ 0.75 │
├────────────────────┼───────────┼────────┤
│ big / small │ antonym │ 0.68 │
├────────────────────┼───────────┼────────┤
│ happy / joyful │ synonym │ 0.68 │
├────────────────────┼───────────┼────────┤
│ happy / sad │ antonym │ 0.37 │
├────────────────────┼───────────┼────────┤
│ banana / democracy │ unrelated │ 0.19 │
└────────────────────┴───────────┴────────┘
“big/small” (opposites) scores the same as “happy/joyful” (same meaning). “fast/slow” (opposites) scores higher than some synonyms. The model knows these words are related but cosine similarity cannot express how — it has one number for everything.
Spectral variance can. When we decompose the embeddings via FFT and measure coherence per frequency band (rather than summing everything into one dot product), the variance across bands is:
- Synonyms: 0.0031 (flat — uniformly coherent across all bands)
- Antonyms: 0.0094 (3x higher — coherent in some bands, anti-coherent in others)
- Unrelated: 0.0215 (7x higher — incoherent noise)
Synonyms are coherent everywhere. Antonyms are coherent in some frequency bands but opposed in others — that opposition is the information cosine similarity sums away.
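A minimal sketch of the band-variance measurement (illustrative only — the real analysis lives in python/embedding_analysis.py; here the “embeddings” are synthetic stand-ins, so only the qualitative ordering carries over):

```python
import numpy as np

# FFT each vector, measure per-band phase agreement via the cross-
# spectrum, then take the variance of that coherence profile.
def band_coherence_variance(a, b, n_bands=8):
    fa, fb = np.fft.rfft(a), np.fft.rfft(b)
    cross = fa * np.conj(fb)                    # per-bin phase alignment
    bands = np.array_split(cross, n_bands)
    coherence = np.array([
        np.abs(band.sum()) / (np.abs(band).sum() + 1e-12)  # 1 = coherent
        for band in bands
    ])
    return coherence.var()

rng = np.random.default_rng(0)
base = rng.standard_normal(384)
synonym = base + 0.1 * rng.standard_normal(384)   # coherent everywhere
unrelated = rng.standard_normal(384)              # incoherent noise

print(band_coherence_variance(base, synonym) <
      band_coherence_variance(base, unrelated))   # True
```

A “synonym-like” pair stays coherent in every band (near-zero variance); an unrelated pair’s coherence fluctuates band to band, so its variance is higher.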
Different relationship types have distinct spectral profiles. Hierarchical pairs (animal→dog, vehicle→car), functional pairs (doctor→hospital, chef→kitchen), and analogical pairs (king→queen, father→mother) each produce a different shape of coherence across frequency bands. These are relationship-type fingerprints that a single cosine score destroys by summing.
Nobody designed this model for harmonic structure — it learned band-specific coherence patterns through gradient descent on sentence similarity. Our harmonic analysis framework provides the tool to detect what’s there but invisible to the standard comparison measure.
The analysis script is available at python/embedding_analysis.py in the repo (https://github.com/atech-hub/Wave-Coherence-as-a-Computational-Primitive). Run it yourself with pip install sentence-transformers && python embedding_analysis.py.
24 tests, 5 corrective findings, all passing. Tagged as the release “Real Embedding Validation — Cosine Similarity Blindness Confirmed in Production Models”.
Update 2: We built a transformer without tokens — harmonic embeddings match trained baseline when completely frozen
Following our previous update where we showed cosine similarity is blind to harmonic structure in all-MiniLM-L6-v2 embeddings, we asked the next question: if the structure is already there in trained models, what happens when you provide it from the start?
The experiment
Three identical character-level transformers (4 layers, 128 dim, 4 heads) trained on Shakespeare. No tokenizer. No BPE. Raw characters mapped to phase angles on the unit circle, embedded via harmonic expansion: [cos(theta), sin(theta), cos(2theta), sin(2theta), …]
┌────────────────────────────────────────────────────────────────┬──────────┬─────────────┐
│ Mode │ Val Loss │ vs Baseline │
├────────────────────────────────────────────────────────────────┼──────────┼─────────────┤
│ Baseline — random Gaussian init, trainable (industry standard) │ 1.5570 │ — │
├────────────────────────────────────────────────────────────────┼──────────┼─────────────┤
│ Harmonic — phase-encoded init, trainable │ 1.5223 │ -2.2% │
├────────────────────────────────────────────────────────────────┼──────────┼─────────────┤
│ Frozen — phase-encoded, NOT trainable │ 1.5567 │ -0.02% │
└────────────────────────────────────────────────────────────────┴──────────┴─────────────┘
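The phase-encoded embedding table can be sketched like this (an illustrative reconstruction of the layout described above, not the repo’s code):

```python
import numpy as np

# Each of 65 characters gets a phase on the unit circle, expanded into
# [cos(theta), sin(theta), cos(2*theta), sin(2*theta), ...] to 128 dims.
def harmonic_embedding(vocab_size=65, dim=128):
    thetas = 2 * np.pi * np.arange(vocab_size) / vocab_size
    table = np.empty((vocab_size, dim))
    for n in range(dim // 2):                 # harmonic n+1 fills 2 columns
        table[:, 2 * n] = np.cos((n + 1) * thetas)
        table[:, 2 * n + 1] = np.sin((n + 1) * thetas)
    return table

emb = harmonic_embedding()
print(emb.shape)   # (65, 128) — no vocabulary table, no tokenizer
```

No learning is involved: the geometric relationships between characters are fixed by construction.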
Why this matters
Harmonic beats random at every single checkpoint. From step 0 through step 5,000, the structured initialization leads and the gap never closes. The model starts closer to the answer because the geometric relationships between characters are built in, not discovered through gradient descent.
The frozen result is the headline. Zero gradient updates to the embedding layer. 40,768 fewer trainable parameters. The model generates coherent Shakespearean dialogue using nothing but fixed cos(n * theta) vectors. The geometry alone carries the signal.
This means the standard approach — initialize with random noise, burn GPU cycles for the model to discover structure — is doing unnecessary work. The structure the model converges toward (as we showed in our previous post with spectral analysis of all-MiniLM-L6-v2) can be provided for free by construction.
And no tokens were needed. 65 characters. 65 phase angles. 128-dimensional harmonic vectors. No vocabulary table, no tokenizer training, no subword merges. The question of “how many tokens” becomes “how many characters” — and the answer is: however many are in the text.
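The frozen mode boils down to loading a fixed phase-encoded table into the embedding layer and switching gradients off. An illustrative PyTorch sketch (not the repo’s training script; the 8,320 figure below counts only the embedding table itself):

```python
import torch
import torch.nn as nn

# Build the fixed cos(n*theta)/sin(n*theta) table for 65 characters.
vocab, dim = 65, 128
thetas = 2 * torch.pi * torch.arange(vocab) / vocab
table = torch.stack(
    [torch.cos((n // 2 + 1) * thetas) if n % 2 == 0
     else torch.sin((n // 2 + 1) * thetas) for n in range(dim)], dim=1)

emb = nn.Embedding(vocab, dim)
emb.weight.data.copy_(table)
emb.weight.requires_grad_(False)   # zero gradient updates, ever

trainable = sum(p.numel() for p in emb.parameters() if p.requires_grad)
print(trainable)   # 0 — the 65*128 = 8,320 embedding weights stay fixed
```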
The three-test chain
┌───────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Test │ What it proved │
├───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Test 21 (synthetic) │ Cosine similarity is blind to harmonic structure — per-channel sweep recovers what dot product destroys │
├───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Test 24 (real embeddings) │ Real model vectors contain this structure — 3x spectral variance difference between synonyms and antonyms │
├───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Test 25 (this result) │ Providing the structure from the start beats learning it from random noise — and works even when frozen │
└───────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Try it yourself
pip install torch --index-url https://download.pytorch.org/whl/cu128
cd python
python harmonic_transformer.py
Requires a CUDA GPU. Shakespeare dataset downloads automatically (~1MB). Trains in ~10 minutes on a consumer GPU.
All code is open source at https://github.com/atech-hub/Wave-Coherence-as-a-Computational-Primitive, tagged as the release “Harmonic Transformer — A Model That Doesn’t Need To Learn Its Embeddings”.
25 tests, 5 corrective findings, all passing.
I wonder what harmonic decomposition would show on orbital frequency data.
Update — March 2026
The framework has reached its current ceiling at our available compute scale (RTX 4070 Ti, character-level Shakespeare):
- Phase C result: 98.1% of MLP performance at 44% of parameters (Kerr-ODE + maestro bottleneck + progressive curriculum)
- 34 experimental phases, 64 defensive engine patterns, 7 corrective findings
- Key findings since the last update: wave-native FFN layers converge toward MLP behaviour with depth; a learned global coordination bottleneck (the maestro) gives a consistent improvement at all depths; and the ODE structure provides implicit regularisation, staying stable where the MLP overfits at high parameter counts
- Honest boundary: scaling beyond this point requires compute resources we don’t have. The architecture properties are measured and the blueprint is complete for anyone with a cluster to test at LLM scale.
Repository: https://github.com/atech-hub/Wave-Coherence-as-a-Computational-Primitive
DOI: 10.5281/zenodo.18820365
Kerr Engine — Pure Rust training engine for Kerr-ODE transformers (3x faster than PyTorch at 128-dim, full GPU via WGSL)
Released the training engine for the Wave Coherence project as a standalone repo under Apache 2.0.
What it is: a specialised Rust engine for training and running Kerr-ODE transformers — the architecture that replaces dense MLP layers with physics-inspired wave propagation (98.1% of MLP at 44% of parameters).
Key numbers:
- 3x faster than PyTorch+CUDA at 128-dim, on CPU alone, GPU off
- Full GPU backward pass at 768-dim via 13 WGSL compute shaders (NVIDIA, AMD, Intel, Apple — no CUDA dependency)
- Hand-derived analytical gradients verified against PyTorch autograd (max diff 7.63e-6)
- 6,500 lines of Rust, 4 dependencies; cargo build --release and you’re done
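The gradient-verification step is the standard finite-difference check. A minimal NumPy sketch of the idea (illustrative — the engine checks its hand-derived gradients against PyTorch autograd, and the loss here is a toy stand-in):

```python
import numpy as np

# Toy model: loss(w) = sum(tanh(w @ x)); compare the hand-derived
# gradient against a central finite difference, entry by entry.
def loss(w, x):
    return np.tanh(w @ x).sum()

def analytic_grad(w, x):
    # d/dw_ij tanh(w_i . x) = (1 - tanh(w_i . x)^2) * x_j
    y = np.tanh(w @ x)
    return np.outer(1 - y**2, x)

rng = np.random.default_rng(1)
w = rng.standard_normal((4, 3))
x = rng.standard_normal(3)

g = analytic_grad(w, x)
eps = 1e-6
fd = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        wp, wm = w.copy(), w.copy()
        wp[i, j] += eps
        wm[i, j] -= eps
        fd[i, j] = (loss(wp, x) - loss(wm, x)) / (2 * eps)

print(np.max(np.abs(g - fd)))   # tiny — analytic matches numeric
```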
The WGSL backward pass shaders (attention backward, batched outer product, batched linear backward) appear to be the first open-source implementations of ML training backward passes in WGSL.
Repo: https://github.com/atech-hub/kerr-engine. Parent project: https://github.com/atech-hub/Wave-Coherence-as-a-Computational-Primitive.
Note: The Kerr-ODE is a novel architecture — it doesn’t work with LM Studio, Ollama, or llama.cpp today. The engine trains and runs inference natively. Ecosystem connector patterns are documented and published as prior art (Pattern 68) for anyone who wants to build bridges.
Built with AI collaboration (Claude Desktop + Claude Code). Stated openly.