Chapter 5: Concept Subspaces, Projections, and Measurement Theory¶
Motivation¶
In Chapter 4, we saw how one state \(Q^+\) gives rise to multiple coordinate representations by projecting onto different subspaces (action slices, state slices).
But all those projections were pointwise—we projected onto individual basis elements \(k(z, \cdot)\).
Natural question: Can we project onto multi-dimensional subspaces discovered by spectral analysis?
Answer: Yes! And this gives us a rigorous framework for concepts in reinforcement learning.
This chapter develops:
- Concepts as invariant subspaces (not clusters)
- Projection operators for concept activation
- Measurement theory connecting to quantum mechanics
- Hierarchical composition via nested subspaces
- Practical algorithms for concept-driven learning
This formalizes Part II (Emergent Structure & Spectral Abstraction) of the GRL tutorial paper.
1. From Spectral Clustering to Concept Subspaces¶
The Problem with Clusters¶
Traditional clustering (k-means, hierarchical, spectral) produces:
- Discrete assignments: Point \(x\) belongs to cluster \(k\)
- Hard boundaries: Sharp transitions between clusters
- No smooth interpolation: Can't blend concepts
This doesn't match how concepts work in cognition or RL!
The GRL Approach: Functional Clustering¶
Section V of the original paper introduces spectral clustering in RKHS:
- Compute kernel matrix \(K_{ij} = k(z_i, z_j)\)
- Eigendecomposition: \(K = \Phi \Lambda \Phi^T\)
- Cluster eigenvectors by similarity
- Eigenmodes = "concepts"
But what does this mean mathematically?
Concepts as Subspaces¶
Key insight: Each cluster of eigenvectors defines a subspace in RKHS.
Formal definition:
Let \(\{\phi_{k,1}, \phi_{k,2}, \ldots, \phi_{k,m_k}\}\) be eigenvectors in cluster \(k\).
Concept \(k\) is the subspace:
\[
\mathcal{C}_k = \operatorname{span}\{\phi_{k,1}, \phi_{k,2}, \ldots, \phi_{k,m_k}\} \subset \mathcal{H}
\]
Properties:
- \(\mathcal{C}_k\) is a linear subspace of the RKHS
- Dimension: \(\dim(\mathcal{C}_k) = m_k\)
- Orthogonal decomposition: \(\mathcal{H} = \bigoplus_k \mathcal{C}_k \oplus \mathcal{C}_{\perp}\), where \(\mathcal{C}_{\perp}\) collects the residual modes outside every concept
Why Subspaces, Not Clusters?¶
Subspaces give you:
- Smooth activation: Degree of membership, not binary
- Compositionality: Combine multiple concepts
- Interpolation: Blend between concepts
- Hierarchy: Nested subspaces = hierarchical concepts
- Operators: Well-defined projection and measurement
Clusters only give you: Hard assignments.
2. Projection Operators¶
Definition¶
For concept subspace \(\mathcal{C}_k\) with orthonormal basis \(\{\phi_{k,i}\}_{i=1}^{m_k}\):
Projection operator:
\[
P_k = \sum_{i=1}^{m_k} \langle \phi_{k,i}, \cdot \rangle_{\mathcal{H}} \, \phi_{k,i}
\]
For the reinforcement field:
\[
P_k Q^+ = \sum_{i=1}^{m_k} \langle \phi_{k,i}, Q^+ \rangle_{\mathcal{H}} \, \phi_{k,i}
\]
Properties¶
1. Idempotence: \(P_k^2 = P_k\)
(Projecting twice = projecting once)
2. Orthogonality: \(P_k P_\ell = 0\) for \(k \neq \ell\)
(Different concepts are orthogonal)
3. Completeness: \(\sum_k P_k + P_\perp = I\)
(The concepts, together with the residual subspace, span the full space)
4. Self-adjointness: \(\langle P_k f, g \rangle = \langle f, P_k g \rangle\)
(Symmetric inner product)
These are exactly the properties of quantum mechanical projection operators!
Visual Intuition¶
Just as projecting a 3D vector onto a plane casts a shadow on that plane, \(P_k Q^+\) is the "shadow" of \(Q^+\) on the concept subspace.
Computational Form¶
Given:
- Reinforcement field: \(Q^+ = \sum_i w_i k(z_i, \cdot)\)
- Concept basis: \(\{\phi_{k,1}, \ldots, \phi_{k,m_k}\}\)
Compute projection:
\[
P_k Q^+ = \sum_{j=1}^{m_k} \Bigg( \sum_{i=1}^{N} w_i \, \langle k(z_i, \cdot), \phi_{k,j} \rangle \Bigg) \phi_{k,j}
\]
In matrix form, the coefficient vector is \(\mathbf{c}_k = \mathbf{B}_k \mathbf{w}\), where:
- \((\mathbf{B}_k)_{ji} = \langle k(z_i, \cdot), \phi_{k,j} \rangle\) (written \(\mathbf{B}_k\) to avoid a clash with the kernel matrix \(K\))
- \(\mathbf{w} = [w_1, \ldots, w_N]^T\)
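To make this concrete, here is a minimal numpy sketch. It assumes (as kernel PCA produces, and as the discovery algorithm in Section 6 constructs) that each concept basis function is itself a combination of the particle kernel sections, \(\phi_{k,j} = \sum_i B_{ji} k(z_i, \cdot)\), with RKHS-orthonormal rows; the names `rbf_kernel` and `project_field` are illustrative, not an existing API.

```python
import numpy as np

def rbf_kernel(Z1, Z2, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = k(Z1[i], Z2[j])."""
    sq = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def project_field(w, K, B):
    """Project Q+ = sum_i w_i k(z_i, .) onto one concept subspace.

    Row j of B holds phi_j = sum_i B[j, i] k(z_i, .), with the rows
    RKHS-orthonormal (B K B^T = I). Returns the coefficient vector
    of P_k Q+ over the same kernel sections k(z_i, .).
    """
    c = B @ K @ w        # c_j = <phi_j, Q+> via the reproducing property
    return B.T @ c       # coefficients of sum_j c_j phi_j

# Toy example: 5 particles in a 2-D augmented space z = (s, a)
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 2))
w = rng.normal(size=5)
K = rbf_kernel(Z, Z)

# A random 2-dimensional "concept", whitened so that B K B^T = I
B = rng.normal(size=(2, 5))
B = np.linalg.inv(np.linalg.cholesky(B @ K @ B.T)) @ B

print("projected coefficients:", project_field(w, K, B))
```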
3. Concept Activation: Observables¶
The Measurement Question¶
Given field \(Q^+\) and concept \(k\), we want to measure:
"How strongly does the current field activate this concept?"
This is exactly the quantum measurement problem!
Concept Activation Observable¶
Definition:
\[
A_k(Q^+) = \|P_k Q^+\|_{\mathcal{H}}^2
\]
Expanded form:
\[
A_k = \sum_{i=1}^{m_k} \big| \langle \phi_{k,i}, Q^+ \rangle \big|^2
\]
Interpretation: the sum of squared projections onto the concept's basis vectors.
Properties¶
1. Non-negativity: \(A_k \geq 0\)
2. Boundedness: \(A_k \leq \|Q^+\|^2\)
(Total activation bounded by field strength)
3. Normalized activation:
\[
\tilde{A}_k = \frac{A_k}{\sum_\ell A_\ell}
\]
gives a probability-like distribution over concepts.
4. Continuity: \(A_k\) varies smoothly as \(Q^+\) evolves; there are no discrete jumps.
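Under the same conventions as the projection sketch above (each concept given by a coefficient matrix \(B_k\) with RKHS-orthonormal rows), the activation observable and its normalization take a few lines of numpy; `concept_activations` is an illustrative name.

```python
import numpy as np

def concept_activations(w, K, bases):
    """A_k = ||P_k Q+||^2 for each concept subspace.

    With an orthonormal basis, ||P_k Q+||^2 is the sum of squared
    coefficients c_j = <phi_{k,j}, Q+>, i.e. ||B_k K w||^2.
    """
    acts = np.array([np.sum((B @ K @ w) ** 2) for B in bases])
    return acts, acts / acts.sum()   # raw A_k and normalized A~_k
```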
Connection to Quantum Mechanics¶
| Quantum Mechanics | GRL Concepts |
|---|---|
| Observable \(\hat{O}\) | Projection operator \(P_k\) |
| Eigenspace \(\mathcal{H}_\lambda\) | Concept subspace \(\mathcal{C}_k\) |
| State \(\vert \psi \rangle\) | Field \(Q^+\) |
| Measurement outcome | Concept activation \(A_k\) |
| Born rule: \(p = \vert \langle \lambda \vert \psi \rangle \vert^2\) | Activation: \(A_k = \Vert P_k Q^+ \Vert^2\) |
This is not an analogy—it's the same mathematical structure!
4. Concept-Conditioned Representations¶
Projected Field Values¶
Standard field evaluation:
\[
Q^+(z) = \langle Q^+, k(z, \cdot) \rangle
\]
Concept-conditioned evaluation:
\[
Q_k^+(z) = \langle P_k Q^+, k(z, \cdot) \rangle = (P_k Q^+)(z)
\]
Interpretation: the value at \(z\) according to concept \(k\) alone.
Concept-Conditioned Policy¶
Standard policy:
\[
\pi(a \mid s) \propto \exp\big(\beta \, Q^+(s, a)\big)
\]
Concept-conditioned policy:
\[
\pi_k(a \mid s) \propto \exp\big(\beta \, Q_k^+(s, a)\big)
\]
Use case: different concepts induce different policies:
- Concept "explore" → high-entropy policy
- Concept "exploit" → peaked policy
- Concept "avoid" → negative values
Action Wavefunction per Concept¶
From Chapter 4, we had action wavefunction \(\psi_s(a) = Q^+(s, a)\).
Concept-specific action wavefunction:
\[
\psi_{s,k}(a) = (P_k Q^+)(s, a)
\]
Interpretation: Action landscape at state \(s\) according to concept \(k\).
Visual intuition:
- Full field \(\psi_s(a)\): multimodal, with several action peaks
- Concept 1, \(\psi_{s,1}(a)\): a single sharp peak
- Concept 2, \(\psi_{s,2}(a)\): broad support across many actions
Different concepts emphasize different action modes!
5. Hierarchical Composition¶
Nested Subspaces¶
Key idea: Concepts can be hierarchical via nested subspaces.
Example:
\[
\mathcal{C}_1 \supset \mathcal{C}_{1,1} \supset \mathcal{C}_{1,1,1}
\]
Interpretation:
- \(\mathcal{C}_1\): "Locomotion" (coarse)
- \(\mathcal{C}_{1,1}\): "Forward motion" (medium)
- \(\mathcal{C}_{1,1,1}\): "Running" (fine)
Hierarchical Projection¶
Level 1 (coarse):
\[
Q_1^+ = P_1 Q^+
\]
Level 2 (medium):
\[
Q_{1,1}^+ = P_{1,1} P_1 Q^+
\]
Level 3 (fine):
\[
Q_{1,1,1}^+ = P_{1,1,1} P_{1,1} P_1 Q^+
\]
Note: Because of nesting, \(P_{1,1,1} P_{1,1} P_1 = P_{1,1,1}\).
Activation Hierarchy¶
Coarse activation:
\[
A_1 = \|P_1 Q^+\|^2
\]
Medium activation (given coarse):
\[
A_{1,1 \mid 1} = \frac{\|P_{1,1} Q^+\|^2}{\|P_1 Q^+\|^2}
\]
Fine activation (given medium):
\[
A_{1,1,1 \mid 1,1} = \frac{\|P_{1,1,1} Q^+\|^2}{\|P_{1,1} Q^+\|^2}
\]
Interpretation: conditional activations down the hierarchy.
Compositional Activation Tree¶
                 Q⁺
                  │
        ┌─────────┼─────────┐
        │         │         │
       P₁        P₂        P₃
       0.6       0.3       0.1
        │
   ┌────┼────┐
   │    │    │
 P₁,₁  P₁,₂  P₁,₃
  0.4   0.3   0.3
   │
 ┌─┴─┐
P₁,₁,₁ P₁,₁,₂
 0.7   0.3
Reading:
- 60% activation in concept 1
- Within concept 1, 40% in sub-concept 1.1
- Within 1.1, 70% in sub-sub-concept 1.1.1
This is a continuous hierarchy, not a discrete tree!
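Under the nesting assumption, the conditional activations are simple ratios; the sketch below reuses the coefficient-matrix representation from Section 2, and its names are illustrative.

```python
import numpy as np

def conditional_activation(w, K, B_parent, B_child):
    """A_{child | parent} = ||P_child Q+||^2 / ||P_parent Q+||^2.

    Assumes the child's span is nested inside the parent's span and
    both coefficient matrices have RKHS-orthonormal rows.
    """
    a_parent = np.sum((B_parent @ K @ w) ** 2)
    a_child = np.sum((B_child @ K @ w) ** 2)
    return a_child / a_parent
```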
6. Spectral Discovery of Concepts¶
Algorithm: Concept Subspace Extraction¶
Input: Kernel matrix \(K \in \mathbb{R}^{N \times N}\), number of concepts \(C\)
Step 1: Eigendecomposition
\[
K = \Phi \Lambda \Phi^T
\]
where \(\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_N)\) with \(\lambda_1 \geq \lambda_2 \geq \cdots\).
Step 2: Select Top Eigenvectors
Keep top \(M\) eigenvectors: \(\{\phi_1, \ldots, \phi_M\}\) (e.g., \(M = 50\)).
Step 3: Cluster Eigenvectors
Apply k-means (or other clustering) to eigenvector matrix \(\Phi_M \in \mathbb{R}^{N \times M}\) to get \(C\) clusters.
Step 4: Define Concept Subspaces
For cluster \(k\) containing eigenvectors \(\{\phi_{i_1}, \ldots, \phi_{i_{m_k}}\}\):
\[
\mathcal{C}_k = \operatorname{span}\{\phi_{i_1}, \ldots, \phi_{i_{m_k}}\}
\]
Output: Concept subspaces \(\{\mathcal{C}_1, \ldots, \mathcal{C}_C\}\)
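The four steps map directly onto numpy and scikit-learn. The sketch below is one plausible implementation: it returns each concept as a coefficient matrix over the kernel sections, made RKHS-orthonormal by the kernel-PCA scaling \(1/\sqrt{\lambda_j}\); the function name and defaults are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def discover_concepts(K, M=50, C=5, seed=0):
    """Spectral concept discovery (Steps 1-4 above).

    Returns a list of matrices B_k; row j of B_k gives
    phi_{k,j} = sum_i B_k[j, i] k(z_i, .), orthonormal in the RKHS.
    """
    # Step 1: eigendecomposition K = Phi Lambda Phi^T (K symmetric PSD)
    lam, V = np.linalg.eigh(K)
    order = np.argsort(lam)[::-1]            # sort eigenvalues descending
    lam, V = lam[order], V[:, order]

    # Step 2: keep the top-M eigenvectors (and drop null modes)
    M = min(M, int(np.sum(lam > 1e-10)))
    lam, V = lam[:M], V[:, :M]

    # Step 3: cluster the eigenvectors, each treated as a point in R^N
    labels = KMeans(n_clusters=C, n_init=10, random_state=seed).fit_predict(V.T)

    # Step 4: one subspace per cluster, scaled so that B_k K B_k^T = I
    return [(V[:, labels == k] / np.sqrt(lam[labels == k])).T for k in range(C)]
```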
Why This Works¶
Intuition: Eigenvectors of the kernel matrix capture modes of variation in the augmented space.
Eigenvectors with similar profiles → related functional patterns → same concept.
Mathematical justification:
- Kernel PCA identifies principal components
- Clustering groups related components
- Subspaces capture multi-dimensional concepts
Adaptive Concept Discovery¶
Instead of fixed \(C\), use adaptive methods:
Gap heuristic: choose \(C\) where the eigenvalue gap is largest:
\[
C = \arg\max_i \, (\lambda_i - \lambda_{i+1})
\]
Information criterion: Minimize BIC or AIC for cluster count.
Stability: Use consensus clustering across multiple runs.
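A minimal instantiation of the gap heuristic (the `max_C` cap is an illustrative guard, not part of the method):

```python
import numpy as np

def eigengap_count(lam, max_C=20):
    """Choose the number of concepts at the largest eigenvalue gap."""
    lam = np.sort(lam)[::-1][:max_C]          # descending spectrum
    return int(np.argmax(lam[:-1] - lam[1:]) + 1)
```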
7. Concept Dynamics and Evolution¶
Temporal Concept Activation¶
As the agent learns, \(Q^+(t)\) evolves, so the concept activations change:
\[
A_k(t) = \|P_k Q^+(t)\|^2
\]
This gives interpretable learning curves:
Activation
^
| Concept 1 (exploration)
| ────╮
| ╰─────────────────
|
| Concept 2 (exploitation)
| ╭─────────────
| ──────╯
+-----------------------> time
"Agent transitioned from exploratory to exploitative concept."
Concept Transition Matrix¶
Define concept dominance: \(c(t) = \arg\max_k A_k(t)\)
Transition matrix: \(T_{k\ell} = P(c(t+1) = \ell | c(t) = k)\)
This reveals concept dynamics without discrete states!
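Given a logged activation matrix with one row per time step (an assumed layout), the dominance sequence and empirical transition matrix follow in a few lines:

```python
import numpy as np

def concept_transitions(activations):
    """Empirical T[k, l] = P(c(t+1) = l | c(t) = k) from A_k(t) logs.

    `activations` has shape (T, C): activation of concept k at step t.
    """
    c = np.argmax(activations, axis=1)        # dominant concept c(t)
    C = activations.shape[1]
    T = np.zeros((C, C))
    for a, b in zip(c[:-1], c[1:]):           # count observed transitions
        T[a, b] += 1
    rows = T.sum(axis=1, keepdims=True)
    return np.divide(T, rows, out=np.zeros_like(T), where=rows > 0)
```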
Concept Persistence¶
Measure stability: How long does a concept remain dominant?
Persistent concepts = stable strategies
Transient concepts = exploratory phases
8. Practical Algorithms¶
Algorithm 1: Concept-Conditioned Policy¶
Input: State \(s\), concept weights \(\{\alpha_k\}\)
Step 1: Compute concept-conditioned fields:
\[
Q_k^+ = P_k Q^+ \quad \text{for each } k
\]
Step 2: Weighted combination:
\[
Q_\alpha^+ = \sum_k \alpha_k \, Q_k^+
\]
Step 3: Policy:
\[
\pi(a \mid s) \propto \exp\big(\beta \, Q_\alpha^+(s, a)\big)
\]
Use case: Mix exploration and exploitation by adjusting \(\alpha_k\).
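A sketch of Algorithm 1 under the conventions of the earlier snippets (RBF kernel, concepts as coefficient matrices) and with a finite candidate action set, which is one common way to realize the softmax in practice; all names and the inverse temperature `beta` are illustrative.

```python
import numpy as np

def concept_policy(s, actions, w, K, Z, bases, alpha, beta=5.0, gamma=1.0):
    """pi(a|s) over a candidate set, from Q_alpha = sum_k alpha_k P_k Q+."""
    # Steps 1-2: coefficients of the alpha-weighted projected field
    w_alpha = sum(a_k * B.T @ (B @ K @ w) for a_k, B in zip(alpha, bases))
    # Evaluate the field at the augmented points (s, a) for each candidate
    za = np.array([np.concatenate([s, np.atleast_1d(a)]) for a in actions])
    k_vec = np.exp(-gamma * ((za[:, None, :] - Z[None, :, :]) ** 2).sum(-1))
    q = k_vec @ w_alpha
    # Step 3: softmax policy (shifted for numerical stability)
    p = np.exp(beta * (q - q.max()))
    return p / p.sum()
```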
Algorithm 2: Hierarchical Action Selection¶
Input: State \(s\), hierarchy depth \(D\)
Level 1: Select coarse concept:
\[
c_1 = \arg\max_k A_k
\]
Level 2: Select medium concept within \(c_1\):
\[
c_2 = \arg\max_j A_{c_1, j \mid c_1}
\]
...
Level \(D\): Select action using the fine concept:
\[
a \sim \pi_{c_D}(\cdot \mid s)
\]
Benefit: Hierarchical decision-making with interpretable intermediate choices.
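One way to sketch the hierarchical walk is a nested dict from concept name to a (basis, children) pair; this data structure is an assumption for illustration. The returned leaf basis can then drive a concept-conditioned policy (e.g. `concept_policy` above with a one-hot \(\alpha\)).

```python
import numpy as np

def hierarchical_select(w, K, tree):
    """Descend the concept tree, taking the most activated child per level.

    `tree` maps a concept name to (B, children); a leaf has children = {}.
    Returns the chosen path and the leaf's basis matrix.
    """
    path, node, B = [], tree, None
    while node:
        scores = {name: np.sum((B_k @ K @ w) ** 2)      # A_k at this level
                  for name, (B_k, _) in node.items()}
        name = max(scores, key=scores.get)
        path.append(name)
        B, node = node[name]
    return path, B
```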
Algorithm 3: Concept-Based Transfer¶
Source task:
- Learn concept subspaces \(\{\mathcal{C}_k^{\text{source}}\}\)
- Store projection operators \(\{P_k^{\text{source}}\}\)
Target task:
- Initialize field: \(Q^+_{\text{target}} = 0\)
- For each experience \((s, a, r)\):
    - Project the particle onto the source concepts: \(z_{\text{concept}} = \sum_k P_k^{\text{source}} \, k((s,a), \cdot)\)
    - Update the field using the projected basis
Why this works: Concepts capture abstract structure that transfers across tasks.
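A deliberately simplified sketch of the idea (plain regression toward the observed reward, no bootstrapping, target field kept on the fixed source dictionary); treat it as a starting point under those assumptions rather than the paper's transfer algorithm.

```python
import numpy as np

def transfer_update(w_tgt, K_src, Z_src, bases_src, z_new, r, lr=0.1, gamma=1.0):
    """One update of the target field through the source concept bases.

    The new particle k(z_new, .) is replaced by its projection onto the
    span of the source concepts, so w_tgt stays indexed by Z_src.
    """
    k_vec = np.exp(-gamma * ((Z_src - z_new) ** 2).sum(-1))   # k(z_i, z_new)
    phi = sum(B.T @ (B @ k_vec) for B in bases_src)           # projected basis
    q = k_vec @ w_tgt                                         # current estimate
    return w_tgt + lr * (r - q) * phi                         # regress toward r
```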
9. Connection to Existing Work¶
Eigenoptions (Machado et al., 2017)¶
Eigenoptions: Use eigenvectors of state transition graph as options (skills).
GRL concepts: Use eigenvectors of kernel matrix as concept subspaces.
Similarities:
- Both use spectral methods
- Both identify "natural" structures
Differences:
- Eigenoptions: single eigenvector = one option (discrete)
- GRL concepts: subspace of eigenvectors = one concept (continuous)
- Eigenoptions: hard assignment
- GRL concepts: soft activation
GRL generalizes eigenoptions to continuous, compositional representations.
Successor Features (Barreto et al., 2017)¶
Successor features: Represent value function as inner product \(V(s) = \langle \psi(s), w \rangle\).
GRL concepts: Represent field as projection \(Q^+ = \sum_k P_k Q^+\).
Similarities:
- Both use linear combinations
- Both enable transfer
Differences:
- Successor features: fixed basis (state features)
- GRL concepts: learned basis (eigenvectors)
- Successor features: flat structure
- GRL concepts: hierarchical structure
Affordances (Gibson, 1979; Khetarpal et al., 2020)¶
Affordance: What actions are possible in a state?
GRL state wavefunction \(\phi_a(s)\) (from Chapter 4) is a learned affordance map!
Concept-conditioned affordances:
\[
\phi_{a,k}(s) = (P_k Q^+)(s, a)
\]
shows affordances from the perspective of concept \(k\).
10. Implications for Part II of Tutorial Paper¶
Current Status of Section V¶
Original paper Section V:
- Introduces spectral clustering idea
- Shows empirical results
- Demonstrates emergent concepts
What's missing:
- Formal definition of concepts (beyond "clusters")
- Operational semantics (what do you do with concepts?)
- Connection to learning algorithms
What This Chapter Provides¶
Formalization:
- Concepts = subspaces \(\mathcal{C}_k \subset \mathcal{H}\)
- Activation = observable \(A_k = \|P_k Q^+\|^2\)
- Hierarchy = nested subspaces
Operations:
- Project field onto concepts: \(P_k Q^+\)
- Condition policy on concepts: \(\pi_k(a|s)\)
- Compose concepts hierarchically
Algorithms:
- Concept discovery (spectral + clustering)
- Concept-conditioned learning
- Hierarchical transfer
Structure for Extended Section V¶
Proposed outline:
V-A. Motivation
- Why functional clustering?
- Limitations of discrete concepts
V-B. Concept Subspaces
- Definition via eigenspaces
- Projection operators
- This chapter's formalism
V-C. Spectral Discovery
- Algorithm
- Adaptive selection
- Stability analysis
V-D. Concept Dynamics
- Activation evolution
- Transition patterns
- Persistence measures
V-E. Hierarchical Composition
- Nested subspaces
- Multi-level activation
- Compositional policies
V-F. Empirical Results
- Concept discovery in benchmark tasks
- Activation curves
- Transfer experiments
V-G. Connections
- Eigenoptions, successor features, affordances
- Relation to hierarchical RL literature
Summary¶
Key Contributions of This Chapter¶
1. Concepts as Subspaces
    - Not clusters, but linear subspaces \(\mathcal{C}_k \subset \mathcal{H}\)
    - Enables smooth, compositional representations
2. Projection Operators
    - Formal definition: \(P_k: \mathcal{H} \to \mathcal{C}_k\)
    - Properties: idempotent, orthogonal, complete
    - Connection to QM measurement theory
3. Concept Activation Observables
    - Measure: \(A_k = \|P_k Q^+\|^2\)
    - Smooth evolution, no discrete jumps
    - Interpretable learning curves
4. Hierarchical Composition
    - Nested subspaces: \(\mathcal{C}_1 \supset \mathcal{C}_{1,1} \supset \cdots\)
    - Conditional activation at each level
    - Natural multi-scale representation
5. Practical Algorithms
    - Spectral discovery
    - Concept-conditioned policies
    - Hierarchical action selection
    - Transfer learning
Key Equations¶
Projection operator:
\[
P_k = \sum_{i=1}^{m_k} \langle \phi_{k,i}, \cdot \rangle \, \phi_{k,i}
\]
Concept activation:
\[
A_k = \|P_k Q^+\|^2
\]
Concept-conditioned field:
\[
Q_k^+ = P_k Q^+
\]
Hierarchical activation:
\[
A_{1,1 \mid 1} = \frac{\|P_{1,1} Q^+\|^2}{\|P_1 Q^+\|^2}
\]
What This Enables¶
Theoretical:
- Rigorous concept formalism
- Quantum measurement theory connection
- Hierarchical composition framework
Practical:
- Interpretable learning (activation curves)
- Hierarchical policies (multi-level decisions)
- Transfer learning (concept basis)
- Compositional strategies (concept mixing)
For Part II:
- Mathematical foundation for Section V
- Operational algorithms
- Clear connection to QM
Further Reading¶
Within This Series¶
- Chapter 1a: State Vector vs. Wavefunction
- Chapter 2: RKHS Basis and Amplitudes
- Chapter 4: Action and State Projections
GRL Tutorials¶
- Tutorial Chapter 2: RKHS Foundations
- Tutorial Chapter 4: Reinforcement Field
- Part II (planned): Emergent Structure & Spectral Abstraction
Related Literature¶
Spectral Methods in RL:
- Machado et al. (2017). "A Laplacian Framework for Option Discovery in Reinforcement Learning." ICML.
- Mahadevan & Maggioni (2007). "Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes." JMLR.
Hierarchical RL:
- Sutton et al. (1999). "Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning." Artificial Intelligence.
- Bacon et al. (2017). "The Option-Critic Architecture." AAAI.
Transfer Learning:
- Barreto et al. (2017). "Successor Features for Transfer in Reinforcement Learning." NIPS.
- Taylor & Stone (2009). "Transfer Learning for Reinforcement Learning Domains: A Survey."
Quantum Measurement Theory:
- von Neumann, J. (1932). Mathematical Foundations of Quantum Mechanics.
- Peres, A. (1993). Quantum Theory: Concepts and Methods. Kluwer.
Functional Data Analysis:
- Ramsay & Silverman (2005). Functional Data Analysis. Springer.
Next Steps¶
For Research:
- Implement spectral concept discovery algorithm
- Test concept-conditioned policies on benchmarks
- Develop hierarchical composition framework
- Apply to transfer learning problems
For Tutorial Paper (Part II):
- Integrate this formalism into Section V
- Add concept activation visualizations
- Show hierarchical decomposition examples
- Connect to experimental results
For Extensions (Papers A/B/C):
- Paper B: "Hierarchical Reinforcement Learning via Concept Subspace Projections"
- Paper C: "Transfer Learning with Functional Concept Bases"
Last Updated: January 14, 2026