Paired Delta Prediction: Siamese Architecture¶
Status: Tested (r=0.38)
Last Updated: December 2025
Overview¶
Paired Delta Prediction uses a Siamese architecture to predict splice site score changes by comparing reference and alternate sequences through shared encoder weights.
Architecture¶
Paired Delta Predictor (Siamese)
┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ ref_seq [B, 4, L] alt_seq [B, 4, L] │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Encoder │ ◄── shared weights ──► │ Encoder │ │
│ │ (Gated CNN)│ │ (Gated CNN)│ │
│ └─────────────┘ └─────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ref_emb [B, D] alt_emb [B, D] │
│ │ │ │
│ └──────────────┬───────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ diff = alt - ref │ │
│ └──────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Delta Head │ │
│ │ (MLP + Output) │ │
│ └──────────────────┘ │
│ │ │
│ ▼ │
│ Δ [B, L, 2] │
│ (Δ_donor, Δ_acceptor) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Characteristics¶
Inputs¶
- ref_seq: Reference sequence [B, 4, L] (one-hot encoded)
- alt_seq: Alternate sequence [B, 4, L] (one-hot encoded)
Target¶
- base_delta:
base_model(alt) - base_model(ref)
Output¶
- Delta scores: [B, L, 2] (per-position Δ_donor, Δ_acceptor)
Results¶
Training Variations¶
| Variation | Correlation | Notes |
|---|---|---|
| V2 Original | r=-0.04 | No learning |
| V2 + 10x data | r=0.002 | Still no correlation |
| Gated CNN | r=0.36 | Architecture matters! |
| + Quantile loss (τ=0.9) | r=0.38 | Best for paired |
| + Scaling | r=0.22 | Overfitting |
| + Temperature | r=-0.03 | No improvement |
| + Multi-task | r=-0.07 | Task interference |
Best Configuration¶
from agentic_spliceai.splice_engine.meta_layer.models import (
SimpleCNNDeltaPredictor,
create_calibrated_predictor
)
# Best architecture
base_model = SimpleCNNDeltaPredictor(
hidden_dim=64,
n_layers=6,
kernel_size=15
)
# Best calibration: Quantile loss
model = create_calibrated_predictor(
base_predictor=base_model,
strategy='quantile',
quantile=0.9 # Focus on large deltas
)
Limitations¶
1. Target Quality Issue¶
The fundamental problem:
If variant is NOT splice-altering but base model predicts a delta:
→ We're training on WRONG targets
→ Model learns noise, not signal
2. Inference Overhead¶
Requires two forward passes: 1. Encode reference sequence 2. Encode alternate sequence 3. Compute difference
This is slower than single-pass approaches.
3. Correlation Ceiling¶
Even with best architecture (Gated CNN) and loss (Quantile), correlation is limited to r=0.38. This suggests the target (base model delta) is not reliable enough.
When to Use¶
✅ Use when: - You trust base model predictions - Reference sequence is always available - Inference speed is not critical
❌ Don't use when: - Base model has known blind spots (most variants!) - Single-pass efficiency is needed - You have SpliceVarDB labels (use Validated Delta instead)
Comparison to Validated Delta¶
| Aspect | Paired (this) | Validated |
|---|---|---|
| Input | ref + alt | alt + var_info |
| Target | base_delta | SpliceVarDB-validated |
| Forward passes | 2 | 1 |
| Correlation | r=0.38 | r=0.41 |
| Target quality | Variable | Ground truth |
Model Files¶
| File | Description |
|---|---|
models/delta_predictor.py |
Original Siamese implementation |
models/delta_predictor_v2.py |
Per-position output version |
models/hyenadna_delta_predictor.py |
SimpleCNNDeltaPredictor |
models/delta_predictor_calibrated.py |
Calibration wrappers |
Lessons Learned¶
- Architecture matters: Gated CNN >> simple CNN
- Loss function matters: Quantile loss >> MSE for sparse deltas
- Target quality is limiting factor: Can't exceed base model accuracy
- Consider single-pass alternatives when SpliceVarDB labels available
For better results, consider using Validated Delta Prediction instead.