Hey everyone!
I’ve been working on a method to interpret sentence-transformer embeddings by identifying semantically meaningful dimensions — like emotionality, scientificness, or question-intent.
The process involves:
- Using a Random Forest classifier to find the most important embedding dimensions for various semantic labels.
- Manually assigning meaning to the top dimensions (e.g., dim_17 = emotionality).
- Visualizing sentence activations using a semantic heatmap to show how these latent axes behave across inputs (rough sketch below).
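If you want to try the pipeline yourself, here is a minimal sketch of the first and third steps, assuming sentence-transformers and scikit-learn. The model name, the toy sentences, and the labels are placeholders for illustration, not the exact setup from the paper.

```python
# Minimal sketch: rank embedding dimensions by Random Forest importance for a
# semantic label, then plot a per-sentence "semantic heatmap" over the top dims.
import numpy as np
import matplotlib.pyplot as plt
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import RandomForestClassifier

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model; any encoder works

sentences = [
    "I am absolutely thrilled about this!",   # emotional
    "This is the worst day of my life.",      # emotional
    "The boiling point of water is 100 C.",   # neutral
    "The meeting is scheduled for 3 pm.",     # neutral
]
labels = np.array([1, 1, 0, 0])  # 1 = emotional, 0 = neutral (toy labels)

# 1) Embed the sentences (shape: n_sentences x n_dims)
E = model.encode(sentences)

# 2) Fit a Random Forest on the raw dimensions and rank them by importance
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(E, labels)
top_dims = np.argsort(clf.feature_importances_)[::-1][:10]
print("Most predictive dimensions:", top_dims)

# 3) Semantic heatmap: activations of the top dimensions per sentence
plt.imshow(E[:, top_dims], aspect="auto", cmap="coolwarm")
plt.yticks(range(len(sentences)), [s[:30] for s in sentences])
plt.xticks(range(len(top_dims)), [str(d) for d in top_dims])
plt.xlabel("embedding dimension")
plt.colorbar(label="activation")
plt.tight_layout()
plt.show()
```

With a real labeled dataset, the same loop can be repeated per label (emotionality, question-intent, etc.) to collect one set of candidate dimensions per semantic axis.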
Here’s the paper (with examples): Semantic Axis Decomposition of Transformer Embeddings
Idea: This could be a practical path toward embedding-level explainability in transformers.
Experimental Extension — Semantic Capsules and Z-Axis Stacking
As a conceptual extension, I’m also exploring a layered structure along a Z-axis, where each embedding dimension becomes a capsule — a vector of nested sub-dimensions, each encoding a deeper or more orthogonal semantic layer.
Instead of treating dim_42 as a single scalar, we represent it as:
dim_42 → [layer_0, layer_1, ..., layer_z]
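As a toy numpy sketch of what that could look like: the per-dimension expansion used here (a fixed random map) is only a placeholder for whatever learned transformation would actually populate the Z-axis.

```python
# Toy sketch of the capsule idea: expand each scalar dimension of an embedding
# into a small vector of sub-layer values. The random expansion is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_dims, z_depth = 384, 4           # 384-dim embedding, 4 nested layers (assumed sizes)

E = rng.standard_normal(n_dims)    # stand-in for a sentence embedding

# One (z_depth,) expansion vector per dimension: dim_i -> dim_i * W[i]
W = rng.standard_normal((n_dims, z_depth))
capsules = E[:, None] * W          # shape: (n_dims, z_depth)

# dim_42 is no longer a scalar but a vector of nested sub-dimensions
print("dim_42 capsule:", capsules[42])   # -> [layer_0, layer_1, ..., layer_3]
```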
The model can then dynamically choose the depth of interpretation, effectively “thinking deeper” when needed (a toy depth-selection sketch follows the list below). This introduces:
- Recursive semantic reasoning
- Modular interpretability
- Potentially unlimited semantic granularity
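The selection mechanism itself is still open. One illustrative possibility, purely an assumption on my part, is to read deeper layers of a capsule only while they still carry enough signal; the threshold and shapes below are arbitrary.

```python
# Illustrative depth-selection rule: keep reading a capsule's layers while
# their magnitude stays above a threshold (always keep at least one layer).
import numpy as np

rng = np.random.default_rng(1)
capsules = rng.standard_normal((384, 4))   # (n_dims, z_depth), as in the sketch above

def select_depth(capsule: np.ndarray, threshold: float = 0.5) -> int:
    """Number of leading layers whose magnitude clears the threshold (min 1)."""
    strong = np.abs(capsule) >= threshold
    depth = capsule.shape[0] if strong.all() else int(np.argmin(strong))
    return max(depth, 1)

depths = np.array([select_depth(c) for c in capsules])
print("mean interpretation depth:", depths.mean())

# "Thinking deeper" for one dimension = reading more of its capsule
k = select_depth(capsules[42])
print("dim_42 read at depth", k, "->", capsules[42][:k])
```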
Each transformation is recursively defined as:
E^(z) = f^(z)(f^(z-1)(...f^(1)(E^(0))))
We tested this by applying capsule-layered transformations and visualizing the shifts using PCA. Even without training, it produced measurable semantic structuring.
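For reference, here is a self-contained version of that kind of experiment. The layer functions (untrained random linear maps with a tanh nonlinearity) and the stand-in embeddings are assumptions for illustration, not the exact setup we used.

```python
# Sketch of the recursive composition E^(z) = f^(z)(...f^(1)(E^(0))...) followed
# by a PCA view of how the population of embeddings shifts with depth.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_sentences, n_dims, z = 50, 384, 3

E0 = rng.standard_normal((n_sentences, n_dims))   # stand-in for sentence embeddings

def make_layer(seed):
    # Each f^(i): a fixed random linear map plus tanh, scaled to keep norms stable
    W = np.random.default_rng(seed).standard_normal((n_dims, n_dims)) / np.sqrt(n_dims)
    return lambda E: np.tanh(E @ W)

layers = [make_layer(s) for s in range(1, z + 1)]   # f^(1) ... f^(z)

# Recursive application: E^(z) = f^(z)(f^(z-1)(...f^(1)(E^(0))))
Ez = E0
trajectory = [E0]
for f in layers:
    Ez = f(Ez)
    trajectory.append(Ez)

# Project every stage into the same 2-D PCA space fitted on all stages
pca = PCA(n_components=2).fit(np.vstack(trajectory))
for depth, E in enumerate(trajectory):
    P = pca.transform(E)
    plt.scatter(P[:, 0], P[:, 1], label=f"E^({depth})", alpha=0.6)
plt.legend()
plt.title("Shift of embeddings under capsule-layered transformations")
plt.show()
```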
Long-Term Vision
This capsule-based axis system could evolve into a framework where embeddings are expandable semantic structures, not static vectors.
The model would learn to select a meaningful depth within each dimension, building context-aware cognition.
In theory, this allows reasoning of arbitrary depth, bounded only by compute rather than by a fixed architecture.
What’s next
- Preparing a demo in Streamlit / Hugging Face Spaces
- Planning an interactive embedding editor with real-time axis activation
- Considering open-sourcing a toolkit for semantic dimension discovery & visualization
Would love your thoughts:
- Is this useful or interesting to you?
- What experiments or edge cases would you test?
- Anyone interested in collaboration?
Thanks for reading!