Complex-Valued RKHS and Interference Effects

Introduction

The previous document established that GRL's reinforcement field shares deep structural similarities with quantum mechanical wavefunctions. This document explores what happens when we extend GRL to complex-valued reproducing kernel Hilbert spaces.

This is where the analogy becomes mathematically literal—and where entirely new learning dynamics become possible.


Motivation: Why Go Complex?

Limitations of Real-Valued RKHS

In standard GRL, the reinforcement field is real-valued:

\[Q^+(z) \in \mathbb{R}\]

This means:

  • Particles can only reinforce or cancel each other (positive/negative weights)
  • There is no notion of phase
  • Interference is limited to constructive/destructive along a single axis

What Complex Numbers Enable

In quantum mechanics, amplitudes are complex:

\[\psi(x) \in \mathbb{C}\]

The probability density is \(|\psi(x)|^2 = \psi^*(x) \, \psi(x)\), where \(\psi^*\) denotes the complex conjugate.

Complex amplitudes enable:

  • Phase relationships: Two particles can have the same magnitude but different phases
  • Rich interference: Constructive, destructive, and partial interference depending on phase alignment
  • Rotation in state space: Phase evolution provides natural dynamics
  • Richer representations: Multi-modal distributions with phase structure

Complex-Valued RKHS: Mathematical Foundation

Definition

A complex RKHS is a Hilbert space \(\mathcal{H}\) over the complex field \(\mathbb{C}\), consisting of functions \(f: \mathcal{X} \to \mathbb{C}\), equipped with a reproducing kernel \(k: \mathcal{X} \times \mathcal{X} \to \mathbb{C}\) satisfying:

  1. Hermitian symmetry: \(k(x, y) = \overline{k(y, x)}\) (complex conjugate)
  2. Positive semi-definiteness: For any \(\{x_i\}\) and \(\{c_i \in \mathbb{C}\}\): \(\sum_{i,j} \bar{c}_i \, c_j \, k(x_i, x_j) \geq 0\)
  3. Reproducing property: \(f(x) = \langle f, k(x, \cdot) \rangle_{\mathcal{H}}\)

The inner product is now sesquilinear (conjugate-linear in first argument):

\[\langle f, g \rangle = \int \overline{f(x)} \, g(x) \, dx\]

Example: Complex Gaussian Kernel

The Gaussian kernel can be augmented with a phase factor:

\[k_{\mathbb{C}}(x, y) = \exp\left(-\frac{\|x-y\|^2}{2\sigma^2}\right) \cdot e^{i\phi(x,y)}\]

where \(\phi(x,y)\) encodes directional or temporal relationships.

Properties:

  • Magnitude decreases with distance (as usual)
  • Phase encodes additional structure
  • Hermitian: \(k_{\mathbb{C}}(x, y) = \overline{k_{\mathbb{C}}(y, x)}\), provided the phase is antisymmetric: \(\phi(x, y) = -\phi(y, x)\)
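
As a sanity check, both kernel properties can be verified numerically. A minimal sketch, assuming the (hypothetical) linear phase \(\phi(x, y) = w^\top (y - x)\) for a fixed vector \(w\); this phase is antisymmetric, and since \(e^{i w^\top (y - x)} = \overline{g(x)}\, g(y)\) with \(g(x) = e^{i w^\top x}\), the Gram matrix also stays positive semi-definite:

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)                       # fixed direction defining the phase

def k_c(x, y, sigma=1.0):
    mag = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
    phi = w @ (y - x)                        # antisymmetric: phi(x, y) = -phi(y, x)
    return mag * np.exp(1j * phi)

X = rng.normal(size=(5, 3))
K = np.array([[k_c(a, b) for b in X] for a in X])

print(np.allclose(K, K.conj().T))            # Hermitian symmetry: True
print(np.linalg.eigvalsh(K).min() >= -1e-9)  # eigenvalues real and non-negative: True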

Complex GRL: The Extended Framework

1. Complex Experience Particles

Each particle carries a position, magnitude, and phase:

\[(z_i, w_i, \phi_i)\]

where:

  • \(z_i = (s_i, \theta_i)\): augmented state-action location
  • \(w_i \in \mathbb{R}^+\): magnitude (importance)
  • \(\phi_i \in [0, 2\pi)\): phase angle

The complex weight is:

\[c_i = w_i \cdot e^{i\phi_i}\]

2. Complex Reinforcement Field

The reinforcement field becomes complex-valued:

\[\Psi(z) = \sum_i c_i \, k(z_i, z) \in \mathbb{C}\]

This is now literally a wavefunction over augmented state-action space.

3. Energy from Squared Magnitude

The value (or negative energy) at a point is the squared magnitude:

\[V(z) = |\Psi(z)|^2 = \Psi^*(z) \cdot \Psi(z)\]

This is the Born rule applied to reinforcement learning.


Interference Effects

Constructive Interference

Two particles at \(z_1, z_2\) with aligned phases (\(\phi_1 \approx \phi_2\)) and comparable kernel values (\(k(z_1, z) \approx k(z_2, z)\)):

\[|\Psi(z)|^2 = |c_1 k(z_1, z) + c_2 k(z_2, z)|^2 \approx (|c_1| + |c_2|)^2 k(z_1, z)^2\]

The combined effect is greater than the sum of individual contributions.

Use case: Reinforce confidence when multiple experiences agree in phase (temporal coherence).

Destructive Interference

Two particles with opposite phases (\(\phi_1 = \phi_2 + \pi\)):

\[|\Psi(z)|^2 \approx (|c_1| - |c_2|)^2 k(z_1, z)^2\]

The particles partially or fully cancel each other.

Use case: Suppress value when experiences disagree in temporal or contextual phase.

Partial Interference

For a general phase difference \(\Delta\phi = \phi_2 - \phi_1\) (with a real-valued kernel):

\[|\Psi(z)|^2 = |c_1|^2 \, k(z_1, z)^2 + |c_2|^2 \, k(z_2, z)^2 + 2|c_1||c_2| \, k(z_1, z) \, k(z_2, z) \cos(\Delta\phi)\]

The interference term \(\cos(\Delta\phi)\) modulates the interaction:

  • \(\Delta\phi = 0\): Constructive (\(\cos = +1\))
  • \(\Delta\phi = \pi/2\): No interference (\(\cos = 0\))
  • \(\Delta\phi = \pi\): Destructive (\(\cos = -1\))
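
As a minimal numeric illustration of the three regimes (a hypothetical two-particle setup with unit magnitudes; the query point sits midway between the particles, so \(k(z_1, z) = k(z_2, z) = k\)):

import numpy as np

def gauss(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

z1, z2, z = np.array([0.0]), np.array([0.5]), np.array([0.25])

for dphi in (0.0, np.pi / 2, np.pi):
    c1, c2 = 1.0 * np.exp(1j * 0.0), 1.0 * np.exp(1j * dphi)
    psi = c1 * gauss(z1, z) + c2 * gauss(z2, z)
    print(f"dphi = {dphi:.2f} -> |Psi|^2 = {abs(psi) ** 2:.4f}")

The printed values fall from roughly \(4k^2\) (constructive) to \(2k^2\) (no interference) to \(0\) (full cancellation).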

Phase Semantics: What Does Phase Represent?

1. Temporal Phase

Idea: Encode when an experience occurred.

\[\phi_i = \omega \cdot t_i\]

where \(t_i\) is the time of experience \(i\), and \(\omega\) is a frequency.

Effect: Recent experiences (similar \(t\)) interfere constructively. Old experiences may cancel new ones if out of phase.

Use case: Temporal credit assignment, recency weighting.
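
A minimal sketch of such a phase assignment (OMEGA is a hypothetical frequency hyperparameter; the resulting angle can be passed as \(\phi_i\) when a particle is stored):

import numpy as np

OMEGA = 0.1  # hypothetical phase frequency, in radians per time step

def temporal_phase(t_i):
    # phi_i = omega * t_i, wrapped into [0, 2π)
    return (OMEGA * t_i) % (2 * np.pi)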

2. Contextual Phase

Idea: Encode context (e.g., episode ID, environment variant).

\[\phi_i = 2\pi \cdot \frac{\text{context}_i}{\text{num\_contexts}}\]

Effect: Experiences from the same context interfere constructively; different contexts may interfere destructively.

Use case: Multi-task learning, domain adaptation.
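
A minimal sketch (NUM_CONTEXTS is a hypothetical task count). Note that with four contexts, contexts that differ by two are exactly out of phase, while neighbouring contexts contribute no cross term (\(\cos(\pi/2) = 0\)):

import numpy as np

NUM_CONTEXTS = 4  # hypothetical number of tasks / environment variants

def contextual_phase(context_id):
    # Evenly spaced phases, one per context
    return 2 * np.pi * context_id / NUM_CONTEXTS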

3. Directional Phase

Idea: Encode directionality in state-action space.

\[\phi(x, y) = \text{angle}(y - x)\]

Effect: Particles pointing in similar directions reinforce; opposite directions cancel.

Use case: Vector field learning, flow-based control.
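
A minimal 2-D sketch, written so it could serve as the phase_fn hook of the kernel sketch later in this document. As written, swapping the arguments shifts the angle by \(\pi\) rather than negating it, so strict Hermitian symmetry is not guaranteed; a linear phase such as \(\phi(x, y) = w^\top (y - x)\) is one Hermitian-preserving alternative:

import numpy as np

def directional_phase(x, y):
    # 2-D case: heading angle of the displacement y - x, wrapped into [0, 2π)
    dx, dy = (y - x)[:2]
    return np.arctan2(dy, dx) % (2 * np.pi)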

4. Learned Phase

Idea: Let a neural network predict phase:

\[\phi_i = \text{NN}_\phi(z_i, \text{context})\]

Effect: The network learns what phase relationships are useful.

Use case: Discover latent temporal or contextual structure.
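
One possible sketch, assuming PyTorch is available; PhaseNet and its dimensions are hypothetical. Predicting two components and taking their angle avoids the wrap-around discontinuity at \(0 / 2\pi\) that a direct angle regression would face:

import torch
import torch.nn as nn

class PhaseNet(nn.Module):
    """Hypothetical phase predictor: maps (z_i, context) features to an angle."""
    def __init__(self, in_dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),   # predict a 2-D direction (x, y components)
        )

    def forward(self, features):
        u = self.net(features)
        # Angle of the predicted direction, wrapped into [0, 2π)
        return torch.atan2(u[..., 1], u[..., 0]) % (2 * torch.pi)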


Complex Spectral Clustering

Motivation

Part II (Emergent Structure & Spectral Abstraction) uses spectral clustering on the kernel matrix to discover concepts. With complex kernels the same machinery applies, but each eigenvector now carries phase information in addition to magnitude.

Complex Kernel Matrix

The Gram matrix is now complex Hermitian:

\[K_{ij} = k_{\mathbb{C}}(z_i, z_j) \in \mathbb{C}\]

Properties:

  • \(K^\dagger = K\) (Hermitian)
  • Eigenvalues are real
  • Eigenvectors are complex

Complex Eigenmodes as Concepts

Eigendecomposition:

\[K = V \Lambda V^\dagger\]

where \(V\) contains complex eigenvectors.

Interpretation:

  • Each eigenvector \(v_j\) is a concept in function space

  • Magnitude \(|v_{ij}|\) indicates particle \(i\)'s contribution to concept \(j\)

  • Phase \(\arg(v_{ij})\) indicates particle \(i\)'s phase alignment within concept \(j\)

Result: Concepts can have internal phase structure, enabling richer hierarchical organization.
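
A minimal NumPy sketch of this decomposition (the Gram matrix here is a random Hermitian positive semi-definite stand-in, not one built from actual particles):

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
K = A @ A.conj().T                       # Hermitian, positive semi-definite stand-in

eigvals, V = np.linalg.eigh(K)           # real eigenvalues (ascending), complex eigenvectors
order = np.argsort(eigvals)[::-1]        # sort concepts by eigenvalue (importance)
eigvals, V = eigvals[order], V[:, order]

concept = V[:, 0]                        # leading "concept"
contribution = np.abs(concept)           # |v_i|: particle i's weight in the concept
alignment = np.angle(concept)            # arg(v_i): particle i's phase within the concept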


Implementation Sketch

Complex Particle Memory

import numpy as np

class ComplexParticleMemory:
    def __init__(self):
        self.positions = []     # z_i: (state, action_params)
        self.magnitudes = []    # w_i: real weights
        self.phases = []        # φ_i: angles in [0, 2π)

    def add(self, z, w, phi):
        self.positions.append(z)
        self.magnitudes.append(w)
        self.phases.append(phi)

    def complex_weights(self):
        return [w * np.exp(1j * phi) 
                for w, phi in zip(self.magnitudes, self.phases)]

Complex Reinforcement Field

def complex_field(z_query, memory, kernel):
    """
    Ψ(z) = Σ_i c_i k(z_i, z)
    """
    psi = 0.0 + 0.0j  # Complex accumulator
    for z_i, c_i in zip(memory.positions, memory.complex_weights()):
        psi += c_i * kernel(z_i, z_query)
    return psi

def value(z_query, memory, kernel):
    """
    V(z) = |Ψ(z)|²
    """
    psi = complex_field(z_query, memory, kernel)
    return np.abs(psi)**2

Complex Kernel

def complex_gaussian_kernel(x, y, sigma=1.0, phase_fn=None):
    """
    k(x,y) = exp(-||x-y||²/2σ²) · exp(iφ(x,y))
    """
    dist_sq = np.sum((x - y)**2)
    magnitude = np.exp(-dist_sq / (2 * sigma**2))

    if phase_fn is None:
        phase = 0.0  # Real-valued by default
    else:
        phase = phase_fn(x, y)

    return magnitude * np.exp(1j * phase)
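
Example Usage

A short, hypothetical end-to-end run of the three sketches above: two in-phase particles reinforce the value near \(z\), while a third, out-of-phase particle largely cancels one of them.

import numpy as np

memory = ComplexParticleMemory()
z = np.array([0.0, 0.0])

memory.add(z + 0.1, 1.0, 0.0)        # in phase
memory.add(z - 0.1, 1.0, 0.0)        # in phase: reinforces
memory.add(z + 0.05, 1.0, np.pi)     # out of phase: cancels

kernel = lambda x, y: complex_gaussian_kernel(x, y, sigma=1.0)
print(value(z, memory, kernel))      # interference-shaped value at z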

Potential Applications

1. Temporal Credit Assignment

Problem: Delayed rewards make credit assignment hard.

Complex GRL Solution:

  • Encode time in phase: \(\phi_i = \omega \cdot t_i\)
  • Recent experiences interfere constructively
  • Automatic temporal discounting via phase decoherence
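
As a minimal numeric illustration of the decoherence claim (hypothetical times and frequency): phases spread over many periods roughly cancel, while recent, clustered phases add coherently.

import numpy as np

OMEGA = 1.0                                  # hypothetical phase frequency
rng = np.random.default_rng(0)

old_t = rng.uniform(0, 100, size=50)         # old experiences: phases well spread
new_t = rng.uniform(99, 100, size=50)        # recent experiences: phases clustered

def coherent_sum(times):
    return np.abs(np.sum(np.exp(1j * OMEGA * times)))

print(coherent_sum(old_t))   # roughly O(sqrt(50)): mostly cancels (decoherence)
print(coherent_sum(new_t))   # close to 50: adds nearly coherently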

2. Multi-Task Learning

Problem: Experiences from different tasks may conflict.

Complex GRL Solution:

  • Encode task ID in phase
  • Same-task experiences reinforce
  • Cross-task experiences may cancel (if interference is destructive)

3. Directional Value Fields

Problem: In navigation, direction matters.

Complex GRL Solution:

  • Encode movement direction in phase
  • Forward-consistent trajectories reinforce
  • Conflicting directions cancel

4. Concept Discovery with Phase Structure

Problem: Concepts may have temporal or contextual organization.

Complex GRL Solution:

  • Complex spectral clustering reveals concepts with phase structure
  • Hierarchical concepts organized by magnitude and phase

Challenges and Open Questions

Implementation Challenges

  1. Complex neural networks: Most ML frameworks assume real-valued parameters.
     Solution: Use complex-valued layers (Trabelsi et al., 2018) or split real and imaginary parts.

  2. Interpretability: Complex weights are harder to visualize.
     Solution: Visualize magnitude and phase separately.

  3. Computational cost: Complex arithmetic is more expensive than real arithmetic.
     Solution: Use backends with native complex dtypes (JAX, PyTorch).

Theoretical Questions

  1. What phase functions are useful? Requires empirical investigation
  2. How to initialize phases? Random, learned, or hand-designed?
  3. When does complex RKHS outperform real RKHS? Depends on problem structure

Connection to Quantum Machine Learning

This framework connects GRL to the emerging field of quantum kernel methods:

  • Havlíček et al. (2019): Quantum feature maps naturally produce complex kernels
  • Schuld & Killoran (2019): Quantum models as complex RKHS
  • Lloyd et al. (2020): Quantum algorithms for spectral clustering

Key insight: Even without quantum hardware, complex-valued RKHS can capture structure that real-valued methods miss.


Summary

| Aspect       | Real RKHS (Part I)        | Complex RKHS (This Doc)           |
|--------------|---------------------------|-----------------------------------|
| Field        | \(Q^+(z) \in \mathbb{R}\) | \(\Psi(z) \in \mathbb{C}\)        |
| Particles    | \((z_i, w_i)\)            | \((z_i, w_i, \phi_i)\)            |
| Interference | Additive only             | Constructive/destructive/partial  |
| Phase        | None                      | Temporal, contextual, directional |
| Spectral     | Real eigenmodes           | Complex eigenmodes with phase     |

Key Takeaway: Complex-valued RKHS is a natural extension of GRL that enables richer dynamics, phase-based reasoning, and structured concept discovery.


Next Steps

For Researchers

  • Implement complex particle memory and kernels
  • Test on problems with temporal or contextual structure
  • Compare complex vs. real spectral clustering

For Theorists

  • Prove convergence guarantees for complex RF-SARSA
  • Characterize when complex kernels outperform real kernels
  • Connect to quantum information theory

For Part II (Spectral Abstraction)

  • Apply complex spectral clustering to concept discovery
  • Investigate phase structure within concepts
  • Develop hierarchical methods leveraging phase

References

Complex-Valued Neural Networks:

  • Trabelsi et al. (2018). Deep complex networks. ICLR.
  • Hirose (2012). Complex-valued neural networks: Advances and applications. Wiley.

Quantum Kernel Methods:

  • Havlíček et al. (2019). Supervised learning with quantum-enhanced feature spaces. Nature 567, 209-212.
  • Schuld & Killoran (2019). Quantum machine learning in feature Hilbert spaces. Physical Review Letters 122, 040504.
  • Lloyd et al. (2020). Quantum embeddings for machine learning. arXiv:2001.03622.

RKHS Theory:

  • Steinwart & Christmann (2008). Support Vector Machines. Springer (Chapter on complex RKHS).
  • Schölkopf & Smola (2002). Learning with Kernels. MIT Press.

GRL Original Paper:


Last Updated: January 12, 2026