Semantic Axis Decomposition of Transformer Embeddings — Interpret Meaningful Dimensions via Heatmaps

Hey everyone!

I’ve been working on a method for interpreting sentence-transformer embeddings by identifying semantically meaningful dimensions, such as emotionality, scientific tone, or question intent.

:magnifying_glass_tilted_left: The process involves:

  1. Using a Random Forest classifier to find the most important embedding dimensions for various semantic labels.
  2. Manually assigning meaning to the top dimensions (e.g., dim_17 = emotionality).
  3. Visualizing sentence activations with a semantic heatmap to show how these latent axes behave across inputs (a rough code sketch of steps 1 and 3 follows this list).
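Here’s a minimal sketch of steps 1 and 3. The model name and the toy labels are assumptions for illustration, not the exact setup from the paper:

```python
# Sketch of steps 1 and 3: rank embedding dimensions by Random Forest
# importance, then plot sentence activations along the top dimensions.
import numpy as np
import matplotlib.pyplot as plt
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import RandomForestClassifier

texts = ["I absolutely love this!",
         "The enzyme catalyzes hydrolysis of the substrate.",
         "Could you explain how this works?",
         "That movie broke my heart."]
labels = ["emotional", "scientific", "question", "emotional"]  # toy labels

model = SentenceTransformer("all-MiniLM-L6-v2")
X = model.encode(texts)                                  # shape (n_sentences, 384)

# Step 1: rank embedding dimensions by importance for the semantic labels
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
top_dims = np.argsort(clf.feature_importances_)[::-1][:10]
print("Candidate semantic axes:", top_dims)

# Step 3: semantic heatmap of activations along the selected dimensions
plt.imshow(X[:, top_dims], aspect="auto", cmap="coolwarm")
plt.yticks(range(len(texts)), [t[:30] for t in texts])
plt.xticks(range(len(top_dims)), top_dims)
plt.xlabel("embedding dimension")
plt.ylabel("sentence")
plt.colorbar(label="activation")
plt.tight_layout()
plt.show()
```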

:page_facing_up: Here’s the paper (with examples): Semantic Axis Decomposition of Transformer Embeddings
:light_bulb: Idea: This could be a practical path toward embedding-level explainability in transformers.


:brain: Experimental Extension — Semantic Capsules and Z-Axis Stacking

As a conceptual extension, I’m also exploring a layered structure along a Z-axis, where each embedding dimension becomes a capsule — a vector of nested sub-dimensions, each encoding a deeper or more orthogonal semantic layer.

Instead of treating dim_42 as a single scalar, we represent it as:
dim_42 → [layer_0, layer_1, ..., layer_z]
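To make the shape of that representation concrete, here is a toy sketch; the expansion function is a random placeholder, not a proposed mechanism:

```python
# Toy illustration of a capsule: one scalar dimension expanded into z sub-layers.
# The expansion is a random placeholder, only meant to show the data shape.
import numpy as np

def expand_to_capsule(value: float, z: int, seed: int = 0) -> np.ndarray:
    """Map a scalar activation to a vector of z nested sub-dimensions."""
    rng = np.random.default_rng(seed)
    sub_axes = rng.standard_normal(z)      # placeholder sub-axis directions
    return value * sub_axes                # capsule: [layer_0, ..., layer_{z-1}]

embedding = np.random.default_rng(1).standard_normal(384)  # stand-in embedding
capsule_42 = expand_to_capsule(embedding[42], z=4)
print("dim_42 as capsule:", capsule_42)
```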

The model can then dynamically choose the depth of interpretation, effectively “thinking deeper” when needed (a toy sketch of this depth selection follows the list below). This introduces:

  • Recursive semantic reasoning
  • Modular interpretability
  • Potentially unlimited semantic granularity
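Here is what “choosing depth” could look like as control flow. The layer transforms and the stopping rule below are untrained placeholders, purely to illustrate the idea:

```python
# Toy sketch of dynamic depth selection: keep applying capsule layers until the
# embedding stops changing much. Layer transforms and the stopping rule are
# placeholders, not a trained mechanism.
import numpy as np

def layer_fn(e: np.ndarray, k: int) -> np.ndarray:
    """Placeholder k-th layer transform: fixed random orthogonal map + tanh."""
    rng = np.random.default_rng(k)
    Q, _ = np.linalg.qr(rng.standard_normal((e.shape[-1], e.shape[-1])))
    return np.tanh(e @ Q)

def think_deeper(e: np.ndarray, max_depth: int = 8, tol: float = 1e-2):
    """Apply layers until the update norm falls below tol; return (e, depth)."""
    for k in range(1, max_depth + 1):
        e_next = layer_fn(e, k)
        if np.linalg.norm(e_next - e) < tol:
            return e_next, k
        e = e_next
    return e, max_depth

e0 = np.random.default_rng(0).standard_normal(384)
e_final, depth_used = think_deeper(e0)
print(f"stopped after {depth_used} layer(s)")
```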

Each transformation is recursively defined as:
E^(z) = f^(z)(f^(z-1)(...f^(1)(E^(0))))

We tested this by applying capsule-layered transformations and visualizing the resulting shifts with PCA. Even without training, the transformations produced measurable structure in the projected embeddings.
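Roughly, the experiment looks like this in code. Random orthogonal maps stand in for the layer transforms f^(k) (the real transforms may differ); the point is the recursive composition and the shared PCA projection:

```python
# Sketch of the recursive composition E^(z) = f^(z)(... f^(1)(E^(0))) followed by
# a PCA comparison of original vs. transformed embeddings. The layer functions
# are random placeholders, not the transforms used in the actual experiment.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def make_layer_fn(dim: int, seed: int):
    """Placeholder f^(k): a fixed random orthogonal map followed by tanh."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return lambda E: np.tanh(E @ Q)

def capsule_stack(E0: np.ndarray, depth: int) -> np.ndarray:
    """Apply f^(1) ... f^(depth) in sequence, returning E^(depth)."""
    E = E0
    for k in range(1, depth + 1):
        E = make_layer_fn(E0.shape[-1], seed=k)(E)
    return E

E0 = np.random.default_rng(0).standard_normal((16, 384))  # toy batch of embeddings
E3 = capsule_stack(E0, depth=3)

# Fit a shared 2D projection so shifts between E^(0) and E^(3) are comparable
pca = PCA(n_components=2).fit(np.vstack([E0, E3]))
P0, P3 = pca.transform(E0), pca.transform(E3)

plt.scatter(P0[:, 0], P0[:, 1], label="E^(0)")
plt.scatter(P3[:, 0], P3[:, 1], label="E^(3)")
for a, b in zip(P0, P3):                       # arrows show the per-item shift
    plt.annotate("", xy=b, xytext=a, arrowprops=dict(arrowstyle="->", alpha=0.4))
plt.legend()
plt.title("Embedding shift under capsule stacking")
plt.show()
```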


:telescope: Long-Term Vision

This capsule-based axis system could evolve into a framework where embeddings are expandable semantic structures, not static vectors.
The model would learn to select a meaningful depth within each dimension, building context-aware cognition.

In theory, this lets reasoning depth scale with available compute rather than being limited by a fixed architecture.


:construction: What’s next

  • Preparing a demo in Streamlit / Hugging Face Spaces
  • Planning an interactive embedding editor with real-time axis activation
  • Considering open-sourcing a toolkit for semantic dimension discovery & visualization

Would love your thoughts:

  • Is this useful or interesting to you?
  • What experiments or edge cases would you test?
  • Anyone interested in collaboration?

Thanks for reading! :folded_hands:

:link: GitHub: GitHub - kexi-bq/embedding-explainer: Interactive editor for text meaning via embedding vector control
