Meta-Layer Experiments
This directory contains documentation for experiments conducted on the meta-layer architecture for splice site prediction.
Experiment Index
| ID | Name | Status | Outcome | Date |
|---|---|---|---|---|
| 001 | Canonical Classification | ✅ Completed | Partial Success | Dec 2025 |
| 002 | Paired Delta Prediction | ✅ Completed | r=0.38 (insufficient) | Dec 2025 |
| 003 | Binary Classification (Multi-Step Step 1) | ✅ Completed | AUC=0.61, F1=0.53 | Dec 2025 |
| 004 | Validated Delta (Single-Pass) | ✅ Completed | r=0.41 (best!) | Dec 2025 |
Experiment Categories
Classification-Based Approaches
- 001_canonical_classification: Train on GTF labels, evaluate on SpliceVarDB (FAILED for variants)
- 003_binary_classification: Multi-Step Step 1 - "Is this variant splice-altering?"
Delta-Based Approaches
- 002_delta_prediction: Paired (Siamese) prediction (r=0.38)
- 004_validated_delta: Single-pass with validated targets (r=0.41) - BEST
Directory Structure
Each experiment follows this structure:
NNN_experiment_name/
├── README.md            # Overview, hypothesis, setup, results summary
├── RESULTS.md           # Detailed numerical results (optional)
├── ANALYSIS.md          # In-depth analysis (optional)
├── LESSONS_LEARNED.md   # Key insights and recommendations (optional)
└── (optional)
    ├── config.yaml      # Experiment configuration
    └── figures/         # Plots and visualizations
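A layout like this can be checked mechanically before an experiment is added to the index. The sketch below is one hypothetical way to do it; the function name `check_experiment_dir` and the rule that `README.md` is the only required file are assumptions drawn from the "(optional)" markers above, not part of the project.

```python
import re
from pathlib import Path

# Assumption: README.md is the only required file, because every other
# entry in the layout is marked optional.
REQUIRED_FILES = ("README.md",)

# Assumption: directory names look like 004_validated_delta.
NAME_PATTERN = re.compile(r"^\d{3}_[a-z0-9_]+$")


def check_experiment_dir(path: Path) -> list[str]:
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    if not NAME_PATTERN.match(path.name):
        problems.append(
            f"directory name {path.name!r} does not match NNN_experiment_name"
        )
    for name in REQUIRED_FILES:
        if not (path / name).is_file():
            problems.append(f"missing required file {name}")
    return problems
```

Calling `check_experiment_dir(Path("004_validated_delta"))` from this directory would return an empty list if the README is in place.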
Key Metrics
For Classification Experiments
- Accuracy: Overall classification accuracy
- AP (Average Precision): Per-class ranking quality
- PR-AUC: Area under precision-recall curve
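To make these definitions concrete, here is a minimal pure-Python sketch of accuracy and per-class average precision. It is illustrative only and is not the evaluation code used in the experiments; the class names in the usage note are hypothetical. AP as computed here is a step-wise summary of the precision-recall curve, so a trapezoidal PR-AUC can differ slightly from it.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_precision(y_true, scores):
    """AP for one class: mean precision at the rank of each true positive.

    y_true holds 0/1 labels for the class of interest and scores holds the
    model's scores for that class. Tied scores get no special handling.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / hits if hits else 0.0
```

For example, `accuracy(["donor", "acceptor", "neither"], ["donor", "neither", "neither"])` counts two matches out of three positions.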
For Delta Prediction Experiments
- Pearson r: Correlation with true deltas
- Detection Rate: % of splice-altering variants detected
- Mean |Δ|: Average absolute delta score
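The three delta metrics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: in particular, treating a variant as "detected" when its absolute predicted delta reaches a threshold, and the default `threshold=0.1`, are both assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between predicted and true deltas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def detection_rate(pred_deltas, is_splice_altering, threshold=0.1):
    """Share of splice-altering variants with |predicted delta| >= threshold.

    The threshold value is an illustrative assumption.
    """
    flagged = [
        abs(delta) >= threshold
        for delta, altering in zip(pred_deltas, is_splice_altering)
        if altering
    ]
    return sum(flagged) / len(flagged) if flagged else 0.0


def mean_abs_delta(pred_deltas):
    """Average absolute predicted delta, i.e. Mean |Δ|."""
    return sum(abs(delta) for delta in pred_deltas) / len(pred_deltas)
```

Note that `pearson_r` divides by zero if either input is constant; a production version would need to guard against that case.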
Quick Reference
Current Best Results
| Task | Best Model | Metric | Value |
|---|---|---|---|
| Classification | Meta-Layer (001) | Accuracy | 99.11% |
| Variant Detection | Validated Delta (004) | Correlation | r=0.41 |
| Binary Classification | Multi-Step (003) | AUC | 0.61 |
Key Findings
- Classification ≠ Detection: High classification accuracy doesn't translate to variant detection
- Training objective matters: Must train for the evaluation task
- Target quality matters: Learning from potentially wrong base model deltas limits paired prediction
- Validated targets work better: SpliceVarDB filtering improves correlation from r=0.38 to r=0.41
- Binary classification is learnable: AUC=0.61 beats random (0.5), but F1=0.53 needs to improve to >0.7
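The AUC and F1 figures quoted for the binary task can be read against the following sketch, which shows why 0.5 is the chance baseline for AUC. It is a minimal pure-Python illustration, not the code used in experiment 003, and it assumes 0/1 labels with 1 meaning splice-altering.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, so a model that gives every variant the same
    score lands exactly on the 0.5 chance level.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

AUC depends only on how the scores rank variants, while F1 also depends on the decision threshold used to turn scores into predictions, so the two can move independently.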
How to Add a New Experiment
- Create directory: NNN_experiment_name/
- Copy template from existing experiment
- Update README.md with hypothesis and setup
- Run experiment, record results
- Analyze and document insights
- Update this index
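The first steps above could be scripted. The sketch below is hypothetical: `next_experiment_id`, `scaffold_experiment`, and the README stub are assumptions, and it writes a stub README rather than copying a template from an existing experiment as the steps describe. The stub's section names follow the README.md description in the directory structure.

```python
import re
from pathlib import Path

# Hypothetical stub; sections mirror "Overview, hypothesis, setup, results summary".
README_TEMPLATE = """# {title}

## Overview

## Hypothesis

## Setup

## Results Summary
"""


def next_experiment_id(root: Path) -> str:
    """Return the next free three-digit ID based on existing NNN_* directories."""
    ids = []
    for entry in root.iterdir():
        match = re.match(r"^(\d{3})_", entry.name)
        if entry.is_dir() and match:
            ids.append(int(match.group(1)))
    return f"{max(ids, default=0) + 1:03d}"


def scaffold_experiment(root: Path, name: str) -> Path:
    """Create root/NNN_name/ with a stub README.md and return the new path."""
    exp_id = next_experiment_id(root)
    exp_dir = root / f"{exp_id}_{name}"
    exp_dir.mkdir()  # raises FileExistsError rather than overwriting
    readme = README_TEMPLATE.format(title=f"{exp_id}: {name}")
    (exp_dir / "README.md").write_text(readme, encoding="utf-8")
    return exp_dir
```

Running the experiment, documenting insights, and updating the index remain manual steps.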
Related Documentation
- ARCHITECTURE.md - Meta-layer architecture
- LABELING_STRATEGY.md - Labeling approaches (planned)
- methods/ - Methodology documentation