Semantic Phonons: Lattice Vibrations in AI Internals

I apply the mathematics of phonon modes to study semantic representations in language models. I derive theoretical predictions and verify them in GloVe and Gemma 2B.

Phonons are collective vibrational modes in crystals that propagate through the lattice as waves.

Introduction

One of the most pressing questions in AI research is how language models represent meaning. How does a model know that “excellent” is better than “good”? How does a model represent time scales, e.g., hour vs. day? Where does the knowledge live that July comes after June? Language models are trained to predict text, and they are very good at doing so, but we know very little about their actual internal representations. Figuring this out is essentially the holy grail of mechanistic interpretability – the quest to open the black box and understand what is going on inside.

There is now growing evidence that semantics are not stored as arbitrary, unstructured patterns. Instead, meaning seems to be encoded through geometry. A recent example comes from Karkada et al., who showed that the twelve months of the year form a near-perfect circle in the activation space of a language model (Gemma 2B). The months are not just close to each other in some vague sense, but are arranged in an ordered loop, reflecting the cyclical structure of the year. The question is whether this result is essentially a coincidence, or whether it reflects something much deeper.

Without noting it explicitly, what the authors did in their paper is to apply mathematical frameworks established decades ago in solid-state physics – in particular, the study of phonon modes. Phonons describe how vibrations propagate through crystals, and the mathematics behind them provides a rich framework for understanding collective behavior on discrete lattices. The months of the year are an example of one of the simplest structures one can deal with: periodic boundary conditions, where the first and last atom of the chain are coupled to each other, as if bending the chain into a ring. Crucially, physics offers an entire arsenal of further-reaching frameworks and tools, for instance for how different modes might interfere with one another. If the geometry of something like the months of the year can be described through this lens, there is a chance to derive significantly more semantic structure. If this holds, it could offer a principled, theoretically grounded toolkit for analyzing the internal structure of AI systems – something the field has long dreamed of. I am currently working out how this can be achieved and have already found several further angles through which it works. This is very recent work, but in this post I want to lay out the core idea and share some early experimental results.

Background

What are phonons?

A crystal is not a static object. Every atom sits in a potential well created by its neighbors, and at any nonzero temperature, these atoms vibrate. The vibrations do not happen independently: since the atoms are coupled, a disturbance at one site propagates through the lattice as a wave. These collective vibrational modes are called ‘phonons’. The key insight is that even though the underlying system is a discrete lattice of atoms, the collective behavior can be described by smooth wave functions. A phonon with wavevector $k$ has a spatial profile $\phi(x) \propto e^{ikx}$, and the frequency $\omega$ depends on $k$ through a dispersion relation. What makes phonons particularly useful as a conceptual tool is the role of boundary conditions. The shape of the vibrational modes depends not just on the coupling between atoms, but on what happens at the edges of the system. Different boundary conditions produce qualitatively different mode shapes, and each corresponds to a distinct geometric prediction.
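To make this concrete, the normal modes of a small chain can be computed directly. Below is a minimal numpy sketch, assuming unit masses and spring constants: it diagonalizes the coupling matrix of a periodic chain and checks the eigenvalues against the textbook dispersion relation $\omega_n^2 = 4\sin^2(\pi n/N)$.

```python
import numpy as np

N = 12  # atoms in the ring (e.g. twelve months)
# Coupling matrix of a periodic chain with unit masses and spring constants:
# each atom is pulled back toward its two nearest neighbors.
shift = np.roll(np.eye(N), 1, axis=0)  # couples atom i to atom i-1 (with wraparound)
D = 2 * np.eye(N) - shift - shift.T

# Eigenvalues are the squared mode frequencies omega_n^2 = 4 sin^2(pi n / N).
omega_sq = np.sort(np.linalg.eigvalsh(D))
predicted = np.sort(4 * np.sin(np.pi * np.arange(N) / N) ** 2)
assert np.allclose(omega_sq, predicted)
```

The zero eigenvalue is the uniform translation of the whole chain; the remaining eigenvectors are the sinusoidal phonon modes.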

Mapping to AI internals

In the phonon picture, the “atoms” of the lattice are words or tokens, and their positions in the embedding space play the role of atomic displacements. A semantic concept like “levels of quality” defines a one-dimensional chain of tokens ordered by meaning: worst, bad, mediocre, okay, good, excellent, best. The embedding vectors of these tokens are the “displacements” of the atoms.

The mathematical justification for this starts with an “old” result in natural language processing. Word embedding models like word2vec and GloVe learn a word embedding matrix $\mathbf{W}\in \mathbb{R}^{C\times d}$ whose $i$-th row $\mathbf{w}_i$ is the $d$-dimensional representation of word $i$. The objective, explicit or implicit, is to predict co-occurrence: how often word $i$ appears within a fixed context window of word $j$. As shown by Levy and Goldberg (2014), a skip-gram model with negative sampling (word2vec architecture) implicitly factorizes the pointwise mutual information (PMI) matrix shifted by a constant:

\[M_{ij}^* \approx \log\frac{P_{ij}}{P_i P_j} = \text{PMI}(i,j),\]

where $P_{ij}$ is the empirical co-occurrence probability and $P_i, P_j$ are unigram probabilities. GloVe explicitly regresses on $\log P_{ij}$, which is proportional to PMI up to marginal terms. Both models are therefore spectral methods on the PMI matrix. To see what this implies geometrically, consider the eigendecomposition $\mathbf{M}^* = \mathbf{\Phi}\mathbf{\Lambda}^* \mathbf{\Phi}^\top$. The trained embeddings then take the form

\[W_{i\mu}= \Phi_{i\mu}\sqrt{|\Lambda_{\mu\mu}^*|}.\]

This means that the $\mu$-th principal component of the word embedding directly encodes the $\mu$-th eigenmode of $\mathbf{M}^*$. Understanding the geometry of word representations therefore reduces to understanding the spectral structure of the PMI matrix.
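This reduction can be checked on a toy example. The sketch below uses hypothetical random co-occurrence counts (not real corpus statistics): it forms the PMI matrix, eigendecomposes it, and confirms that the embedding built from $\Phi$ and $\sqrt{|\Lambda^*|}$ reconstructs the PMI matrix up to the signs of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical symmetric co-occurrence counts for a 6-word vocabulary.
C = rng.integers(1, 50, size=(6, 6))
C = C + C.T  # symmetrize

P = C / C.sum()                    # joint probabilities P_ij
p = P.sum(axis=1)                  # unigram marginals P_i
M = np.log(P / np.outer(p, p))     # PMI matrix

# Spectral factorization: W_{i mu} = Phi_{i mu} * sqrt(|Lambda_mu|)
lam, Phi = np.linalg.eigh(M)
W = Phi * np.sqrt(np.abs(lam))

# W reconstructs M once the eigenvalue signs are restored.
assert np.allclose(W @ np.diag(np.sign(lam)) @ W.T, M, atol=1e-8)
```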

A natural follow-up question is whether this connection carries over to modern language models, which are far more complex than word2vec or GloVe. Recent work suggests that it does, at least as a first approximation. Cagnetta and Wyart (2024) showed that transformers trained on hierarchical data first learn to exploit short-range token correlations before progressively resolving longer-range ones, effectively building deeper representations of the data structure as the training set grows. Complementarily, Rende et al. (2024) demonstrated that transformers learn many-body token interactions in order of increasing degree, with pairwise co-occurrence statistics of the kind captured by PMI being acquired first. This learning hierarchy suggests that the PMI geometry forms a kind of scaffold on which richer representations are later built. Empirical validation for this picture was recently provided by Karkada et al., who showed that PMI-derived geometric predictions persist in the internal activations of Gemma 2B. The spectral structure of co-occurrence is not just an artifact of shallow embedding models; it appears to be a feature that survives into the deeper layers of modern architectures.

The phonon framework gives a physical interpretation to this spectral structure. A phonon mode corresponds to a direction in embedding space along which tokens are coherently displaced, and that direction is precisely one of the eigenmodes of the co-occurrence matrix. If the tokens are arranged according to a smooth vibrational mode $\phi_n$, we expect their principal components (PCs) to trace out the shape of that mode: the first PC gives $\phi_1$, the second gives $\phi_2$, and so on. The question becomes: which mode shapes do we actually observe, and which boundary conditions do they correspond to?

Boundary conditions

Different boundary conditions. Karkada et al. used periodic boundary conditions for the circular structure of the months of the year; other semantic concepts can be represented by alternative boundary conditions.

Different boundary conditions impose different constraints on how vibrations can look at the endpoints of a chain, and each produces a characteristic geometric signature. The starting point is always the same: we look for modes satisfying

\[\phi^{\prime\prime}(x) = -k^2\phi(x),\]

which has the general solution $\phi(x) = Ae^{ikx}+Be^{-ikx}$ or equivalently $\phi(x) =A\sin(kx)+B\cos(kx)$. The boundary conditions select which values of $k$ are allowed and fix the ratio $A/B$, determining the mode shapes. There is a vast number of boundary conditions that may be interesting moving forward, but I will here lay out just a few of the most important ones.

Periodic boundary conditions

The simplest case is periodic boundary conditions, which connect the two endpoints of the chain as if bending it into a loop. The conditions $\phi(0) = \phi(L)$ and $\phi^\prime(0)=\phi^\prime(L)$ require the function and its derivative to match at both ends. Starting from the complex form of the general solution, both conditions are satisfied when $e^{ikL}=1$, which forces $k_n=2\pi n/L$ for integer $n$. The resulting modes are complex exponentials

\[\phi_n(x) = e^{2\pi inx/L},\]

whose real and imaginary parts give $\cos(2\pi nx/L)$ and $\sin(2\pi nx/L)$. For the first mode ($n=1$), the two components satisfy $\cos^2+\sin^2 =1$. This gives the geometry shown by Karkada et al., where the month tokens trace out a circle in the $(\text{PC}_1, \text{PC}_2)$ plane. Other semantic concepts that would be natural candidates for this boundary condition include days of the week, hours of the day, and similarly cyclic concepts.
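A two-line numerical check of this circular geometry, assuming twelve tokens at evenly spaced positions around the loop:

```python
import numpy as np

L = 12  # twelve months
x = np.arange(L)
pc1 = np.cos(2 * np.pi * x / L)  # real part of the first periodic mode
pc2 = np.sin(2 * np.pi * x / L)  # imaginary part
# Every token lies on the unit circle in the (PC1, PC2) plane.
assert np.allclose(pc1**2 + pc2**2, 1.0)
```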

Dirichlet boundary conditions

Dirichlet boundary conditions pin the displacement to zero at both endpoints: $\phi(0) = 0$ and $\phi(L) = 0$. The condition at $x=0$ gives $B=0$, leaving $\phi(x) = A\sin(kx)$. The condition at $x=L$ requires $\sin(kL) = 0$, so $kL=n\pi$, for which the modes are

\[\phi_n(x) = \sin\left(\frac{n\pi x}{L}\right), \quad n=1,2,3,\dots\]

The endpoint tokens have zero projection onto every mode. Semantically, this can be interpreted as the end tokens carrying no semantic variation, with all representational “richness” living in the interior of the chain. This might apply to concepts where the extremes are absolute states, something like a scale from “dead” to “alive”, where the endpoints are definitionally fixed and the semantic nuance (dying, recovering, thriving…) lives in between.

Neumann boundary conditions

Neumann boundary conditions require the derivative to vanish at both endpoints: $\phi^\prime(0) = 0$ and $\phi^\prime(L)=0$. With the derivative of the general solution as $\phi^\prime(x) = Ak\cos(kx)-Bk\sin(kx)$, the condition at $x=0$ implies $A=0$ (assuming $k\neq0$). The remaining function must satisfy $\phi^\prime(L) = -Bk\sin(kL) = 0$, which again requires $kL=n\pi$. The modes are

\[\phi_n(x) = \cos \left(\frac{n\pi x}{L}\right), \quad n=0,1,2,\dots\]

Note that $n=0$ is now a valid mode that gives $\phi_0=\text{const}$, corresponding to a uniform displacement. Since PCA on centered data removes the mean, this constant mode is projected out, and the first PC corresponds to $\phi_1$. With $\theta = \pi x/L$, the first nontrivial mode is $\phi_1=\cos(\theta)$ and the second is $\phi_2=\cos(2\theta)$. Using $\cos(2\theta) = 2\cos^2(\theta)-1$, these satisfy the Chebyshev relation

\[\text{PC}_2 = 2\text{PC}_1^2-1,\]

which is a parabola in the $(\text{PC}_1, \text{PC}_2)$ plane. Neumann conditions correspond to open-ended ordinal scales, where the endpoints are not fixed in value, but the rate of semantic change flattens out. For example, “excellent” is not fundamentally different from “good”, but rather just further along the scale. Concepts that might be encoded using this boundary condition include levels of quality (terrible to excellent), levels of certainty (impossible to certain), or emotional valence (miserable to wonderful).
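The Chebyshev relation is easy to verify numerically. The sketch below assumes seven evenly spaced token positions (e.g. worst to best); in a real model the spacing need not be uniform, but the identity holds pointwise regardless:

```python
import numpy as np

N = 7  # e.g. worst, bad, mediocre, okay, good, excellent, best
x = np.linspace(0, 1, N)       # positions along the chain, with L = 1
pc1 = np.cos(np.pi * x)        # Neumann mode phi_1
pc2 = np.cos(2 * np.pi * x)    # Neumann mode phi_2
# Chebyshev identity cos(2t) = 2 cos^2(t) - 1  =>  PC2 = 2 PC1^2 - 1
assert np.allclose(pc2, 2 * pc1**2 - 1)
```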

Robin boundary conditions

Robin boundary conditions interpolate between Dirichlet and Neumann. The condition $\alpha \phi + \beta\, \partial_n\phi=0$ at each endpoint, with $\partial_n$ the outward derivative ($-\phi^\prime$ at $x=0$ and $+\phi^\prime$ at $x=L$), allows the mode to neither vanish nor have zero slope, but to satisfy a weighted combination of the two. Applying the condition at $x=0$ to the general solution gives $\alpha B - \beta Ak = 0$, fixing the ratio $A/B = \alpha/(\beta k)$. The mode shape becomes

\[\phi(x)\propto \cos(kx) + \frac{\alpha}{\beta k}\sin(kx).\]

Applying the condition at $x=L$ then yields a transcendental equation for the allowed wavenumbers, $\tan(kL) = 2\alpha\beta k/(\beta^2 k^2 - \alpha^2)$, which must be solved numerically. For $\alpha/\beta \rightarrow 0$, the Neumann cosine modes are recovered; for $\alpha/\beta \rightarrow \infty$, the Dirichlet sine modes emerge. In PC space, the geometry interpolates smoothly between the Dirichlet and Neumann cases. This might arise for concepts where the endpoints have partial, but not total, semantic anchoring, meaning they exert some pull, but are not absolutely rigid.
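The allowed wavenumbers can be found with a simple root search. The sketch below assumes $\alpha=\beta=1$ and $L=1$ and writes the endpoint condition with the outward derivative, which leads to the root function $2\alpha\cos(kL) + \frac{\alpha^2-\beta^2k^2}{\beta k}\sin(kL)$; bisection then locates the first allowed $k$.

```python
import numpy as np

L = 1.0
alpha, beta = 1.0, 1.0  # hypothetical Robin weights

def robin_root_fn(k):
    """Robin condition at x = L for the mode cos(kx) + alpha/(beta k) sin(kx)."""
    return (2 * alpha * np.cos(k * L)
            + (alpha**2 - (beta * k) ** 2) / (beta * k) * np.sin(k * L))

def bisect(f, a, b, tol=1e-12):
    """Plain bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

# The sign change of the root function brackets the first wavenumber.
k1 = bisect(robin_root_fn, 1.0, 1.5)
# It sits between the Neumann (k=0) and Dirichlet (k=pi/L) extremes.
assert 0 < k1 < np.pi / L
assert abs(robin_root_fn(k1)) < 1e-6
```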

2D Neumann boundary conditions

Many semantic concepts are not one-dimensional. When tokens vary along two independent semantic axes, the appropriate framework is a two-dimensional domain with Neumann conditions on all four edges: $\partial\phi/\partial x=0$ at $x=0,L_x$ and $\partial\phi/\partial y=0$ at $y=0,L_y$. The wave equation can be separated as $\phi(x,y) = X(x)Y(y)$, where each factor satisfies the one-dimensional Neumann problem. The modes are therefore products of cosines:

\[\phi_{mn}(x,y) =\cos\left(\frac{m\pi x}{L_x} \right)\cos\left(\frac{n\pi y}{L_y} \right), \quad m,n=0,1,2,\dots\]

The lowest nontrivial modes are $\phi_{10}$ (variation along $x$ only), $\phi_{01}$ (variation along $y$ only), and $\phi_{11}$ (variation along both).
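A quick numpy check of the separability on a hypothetical 5 × 4 grid of positions:

```python
import numpy as np

Lx = Ly = 1.0
x = np.linspace(0, Lx, 5)[:, None]  # column vector of x positions
y = np.linspace(0, Ly, 4)[None, :]  # row vector of y positions

def mode(m, n):
    """2D Neumann mode phi_mn(x, y), evaluated on the grid by broadcasting."""
    return np.cos(m * np.pi * x / Lx) * np.cos(n * np.pi * y / Ly)

phi10, phi01, phi11 = mode(1, 0), mode(0, 1), mode(1, 1)
# Separability: the (1,1) mode is the elementwise product of the two 1D modes.
assert np.allclose(phi11, phi10 * phi01)
```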

First experiments

I tested this framework by applying a few boundary conditions to specific semantic concepts and comparing the theoretical predictions to the representations I found in GloVe and in Gemma 2B.

Experimental details

For GloVe, I used the 300-dimensional GloVe Common Crawl embeddings (400k vocabulary) loaded via gensim. Each word in a concept set is looked up directly and its 300-dimensional vector extracted. For Gemma 2B (18 layers), each word is placed in a short disambiguating sentence (e.g., In terms of quality, excellent is a typical example.) and the hidden-state activations at a specified layer are extracted by mean-pooling over the token positions that span the target word. Embeddings from either model are assembled into an $N \times D$ matrix, mean-centered across the $N$ words, and decomposed via SVD; the PCA scores are the columns of $U \cdot S$ (i.e., projections onto the leading principal directions).
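The PCA step of this pipeline can be sketched as follows; the function name and the random stand-in matrix are illustrative, not the actual extraction code:

```python
import numpy as np

def pca_scores(E):
    """PCA scores of an N x D embedding matrix via SVD.

    Mean-centers across the N words, then returns U * S, i.e. the
    projections of each word onto the leading principal directions.
    """
    Ec = E - E.mean(axis=0, keepdims=True)
    U, S, Vt = np.linalg.svd(Ec, full_matrices=False)
    return U * S

# Hypothetical stand-in for a 7-word concept set in 300-dim GloVe space.
rng = np.random.default_rng(0)
E = rng.normal(size=(7, 300))
scores = pca_scores(E)

# Sanity check: the scores map back to the centered data through V^T.
Ec = E - E.mean(axis=0, keepdims=True)
U, S, Vt = np.linalg.svd(Ec, full_matrices=False)
assert np.allclose(scores @ Vt, Ec, atol=1e-8)
```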

Ordinal scales

I started with three ordinal concepts: levels of quality, levels of certainty, and emotional valence. These are examples of open-ended ordinal scales. The tokens have a clear linear ordering, but neither endpoint is pinned to an absolute, immovable value. The rate of semantic change flattens out at the extremes: the difference between “terrible” and “bad” feels smaller than the difference between “terrible” and “decent”. This is a signature of Neumann boundary conditions, where the derivative of the mode vanishes at the endpoints. The meaning does not stop abruptly, but it levels off.

The theoretical setup is rather straightforward. Given $N$ tokens placed at evenly spaced positions $x_i=(i-1)L/(N-1)$ along a chain of length $L$, the Neumann eigenmodes evaluated at those positions give the predicted scores for each token. As derived above, the first modes satisfy the Chebyshev relation $\text{PC}_2=2\text{PC}_1^2-1$ (regardless of $N$). Hence, theory predicts the tokens lie on a parabola in the $(\text{PC}_1, \text{PC}_2)$ plane.

The Neumann boundary condition applies to open-ended ordinal scales: concepts with a natural ordering where neither endpoint is semantically pinned to a fixed value. Theory predicts that the embeddings lie on a parabola.

For all three concepts, both in GloVe and Gemma, the embedding coordinates in the $(\text{PC}_1, \text{PC}_2)$ plane fall on this parabolic shape, in some cases with astonishingly good accuracy.

Testing the embeddings in GloVe and Gemma 2B against the theoretical predictions. The embeddings lie on the theoretically predicted parabola, despite some perturbations in terms of ordering. Heatmaps show pairwise cosine similarity in the top-6 principal component subspace of each word set.

The ordering along the parabola, however, does not always match the ranking we would intuitively assign. This is worth pausing on. The parabolic geometry is a consequence of the boundary condition alone and holds for any set of tokens governed by Neumann conditions, regardless of their spacing. The ordering, by contrast, depends on the positions $x_i$ along the chain, which are set by the model’s co-occurrence statistics rather than by any geometric constraint. When the ordering deviates from our expectation, the most natural explanation is either that near-synonyms like “good” and “fine” are ranked differently by the model’s internal statistics than by our intuition, or that the tokens do not form a clean one-dimensional chain in the model’s representation but participate in multiple overlapping semantic dimensions whose interference shifts their effective positions along the curve.

Logarithmic scales

I also tested concepts that vary over many orders of magnitude, such as storage capacity (byte to exabyte), temporal duration (second to century), and monetary value (cent to trillion). In contrast to the ordinal scales above, the assumption of uniform spacing breaks down here. The semantic distance between a kilobyte and a megabyte is not the same as between a megabyte and a megabyte-plus-one-kilobyte. What matters is the ratio between adjacent levels, not their difference, for which the meaning-carrying structure is logarithmic.

This changes the boundary conditions. Logarithmic scales are asymmetric by nature. The lower end is anchored at a definite smallest value: the scale of storage starts at a byte, monetary value at a cent, and so on. These are not fundamental limits, but they mark the point where each concept begins. The upper end, by contrast, is open-ended; there is no natural maximum to storage or money, so the scale is free there. This asymmetry calls for mixed boundary conditions: Dirichlet at the lower end with $\phi(0) =0$ and Neumann at the upper end with $\phi^\prime(L) = 0$. The resulting modes are

\[\phi_n(u) = \sin\left(\frac{(2n-1)\pi}{2}u \right), \quad n=1,2,3,\dots\]

where $u$ is the normalized log-position

\[u = \frac{\log x - \log x_{\min}}{\log x_{\max} - \log x_{\min}},\]

which maps the range of physical values onto the interval $[0,1]$. The Dirichlet condition at $u=0$ and the Neumann condition at $u=1$ together select odd numbers of quarter-cycles: the first mode completes exactly one quarter-cycle, the second three quarters of a cycle, and so on. These are simple sinusoids in the logarithmic coordinate $u$ (though they appear as chirps when plotted against the physical variable $x$).
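Numerically, the log-position mapping and the lowest mode are straightforward. The sketch below uses hypothetical storage levels from byte to exabyte and a mode shape chosen to satisfy the Dirichlet condition at the anchored end and the Neumann condition at the open end:

```python
import numpy as np

# Hypothetical storage levels in bytes: byte, kilobyte, ..., exabyte.
x = np.array([1e0, 1e3, 1e6, 1e9, 1e12, 1e15, 1e18])
u = (np.log(x) - np.log(x.min())) / (np.log(x.max()) - np.log(x.min()))
assert np.isclose(u[0], 0.0) and np.isclose(u[-1], 1.0)

def phi(n, u):
    """Mode n with phi(0) = 0 (Dirichlet) and phi'(1) = 0 (Neumann)."""
    return np.sin((2 * n - 1) * np.pi * u / 2)

# The first mode vanishes at the anchored lower end and peaks at the open top.
assert np.isclose(phi(1, u)[0], 0.0) and np.isclose(phi(1, u)[-1], 1.0)
```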

For logarithmic scales, atoms spaced uniformly in log-space produce qualitatively different mode shapes than linear spacing. Theory predicts the mode projections to follow sinusoidal chirp shapes when plotted against the normalized log-position u.

The analysis also differs slightly from the ordinal case in how the modes are extracted from the data. For ordinal scales, theory predicts the parabola to live in a specific two-dimensional plane in PC space, so the first two principal components directly reveal the geometry. For logarithmic scales, there is no reason the relevant mode should align with the first principal component. To test the prediction, we search over the top principal components to find the one whose values most closely follow a quarter-sine profile $A\sin\left(\tfrac{\pi u}{2} + \varphi\right) + b$, where amplitude, phase, and offset are fit freely. The plots show this best-matching principal component plotted against the normalized log-position $u$, with the theoretical curve overlaid.
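The mode search can be implemented as an ordinary least-squares fit, since $A\sin(\pi u/2+\varphi)+b$ expands into a linear combination of $\sin(\pi u/2)$, $\cos(\pi u/2)$, and a constant. The sketch below plants a quarter-sine in one column of synthetic scores (illustrative data, not the real model output) and recovers it:

```python
import numpy as np

def best_quarter_sine_pc(scores, u):
    """Index of the PC whose values best follow A*sin(pi*u/2 + phase) + b.

    The fit is linear in (A*cos(phase), A*sin(phase), b), so each column
    is fit by least squares and the smallest residual wins.
    """
    X = np.column_stack([np.sin(np.pi * u / 2),
                         np.cos(np.pi * u / 2),
                         np.ones_like(u)])
    resid = []
    for j in range(scores.shape[1]):
        coef, *_ = np.linalg.lstsq(X, scores[:, j], rcond=None)
        resid.append(np.sum((scores[:, j] - X @ coef) ** 2))
    return int(np.argmin(resid))

# Synthetic check: plant the quarter-sine profile in PC 2 of random scores.
rng = np.random.default_rng(1)
u = np.linspace(0, 1, 9)
scores = rng.normal(size=(9, 6))
scores[:, 2] = 3.0 * np.sin(np.pi * u / 2 + 0.3) - 0.5
assert best_quarter_sine_pc(scores, u) == 2
```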

Experimental validation of the chirp prediction for GloVe and Gemma 2B across storage, time, and monetary scales. The concepts lie on the predicted chirp curves, despite certain outliers. Shown is the best-fitting principal component projected against the normalized log-position $u$. Heatmaps show pairwise cosine similarity in the top-6 principal component subspace of each word set.

Testing these three concept sets in both GloVe and Gemma 2B, we find confirmation of the predicted mode shapes. In GloVe, the agreement is particularly clean for storage and money, with the tokens following the theoretical quarter-sine arc when plotted against their log-positions. Time is more ambiguous and could be assigned to either mode 1 or mode 2. In Gemma 2B, tokens generally conform better to the second mode, suggesting it carries more of the semantic signal for these concepts. This is not unexpected: the theory predicts a family of modes, and which one dominates in a given model depends on the eigenvalue spectrum of the co-occurrence matrix. The eigenvalues determine how much variance each mode captures, and their relative magnitudes are shaped by the training data, the tokenization, and the model architecture.

Outlook

These are early results, but very encouraging ones. Karkada et al. showed that the twelve months of the year are embedded as a near-perfect circle in the activations of Gemma 2B, a geometry that follows from periodic boundary conditions applied to a cyclic concept. The mathematics they used is just an excerpt from the broader frameworks developed in solid-state physics to study crystal vibrations. I asked myself whether this hints at a deeper connection beyond a single case, tested different boundary conditions on alternative semantic concepts, and found confirmation. Much more work is needed to establish how robust and general these patterns are, but the basic premise, that the geometry of word representations can be derived from physical principles applied to the semantic structure of a concept, appears to hold.

I find it noteworthy that just around the time of this writing, a paper appeared on the arXiv (accepted at ICLR 2026) titled “The Lattice Representation Hypothesis of Large Language Models”. The author shows that LLM embeddings encode not just individual concepts but the algebraic structure of a concept lattice, with operations like meet and join recoverable directly from the geometry. If this lattice structure is real, the phonon modes I proposed could just be the vibrational modes of exactly that lattice, with the boundary conditions set by the local topology of the concept graph. The lattice would provide the structure, and the phonon framework the dynamics. This is speculative, of course.

I am currently exploring how the phonon picture can be extended to additional boundary condition types and to higher-dimensional semantic structures, and how different modes might interfere with one another and what such interference would actually mean. The goal is a theoretically grounded description of how models represent meaning internally. My hope is that such a description could allow us to understand how concepts like morality are encoded, and how they could be enforced, which is the best shot I see at solving alignment.