Meta-Layer Experiments¶

This directory contains documentation for experiments conducted on the meta-layer architecture for splice site prediction.

Experiment Index¶

ID	Name	Status	Outcome	Date
001	Canonical Classification	✅ Completed	Partial Success	Dec 2025
002	Paired Delta Prediction	✅ Completed	r=0.38 (insufficient)	Dec 2025
003	Binary Classification (Multi-Step Step 1)	✅ Completed	AUC=0.61, F1=0.53	Dec 2025
004	Validated Delta (Single-Pass)	✅ Completed	r=0.41 (best!)	Dec 2025

Experiment Categories¶

Classification-Based Approaches¶

001_canonical_classification: Trained on GTF labels and evaluated on SpliceVarDB; classification succeeded, but variant detection failed
003_binary_classification: Step 1 of the multi-step approach, answering "Is this variant splice-altering?"

Delta-Based Approaches¶

002_delta_prediction: Paired (Siamese) prediction (r=0.38)
004_validated_delta: Single-pass with validated targets (r=0.41) - BEST

Directory Structure¶

Each experiment follows this structure:

NNN_experiment_name/
├── README.md           # Overview, hypothesis, setup, results summary
├── RESULTS.md          # Detailed numerical results (optional)
├── ANALYSIS.md         # In-depth analysis (optional)
├── LESSONS_LEARNED.md  # Key insights and recommendations (optional)
├── config.yaml         # Experiment configuration (optional)
└── figures/            # Plots and visualizations (optional)

Key Metrics¶

For Classification Experiments¶

Accuracy: Overall classification accuracy
AP (Average Precision): Per-class ranking quality
PR-AUC: Area under precision-recall curve
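
As a rough illustration of the first two metrics, the sketch below computes accuracy and average precision in plain Python. The function names, the class labels (donor, acceptor, neither), and the scores are all hypothetical toy values, not the project's evaluation code. Average precision is computed here as the mean of precision at each positive's rank, which is a step-wise estimate of the area under the precision-recall curve; a trapezoidal PR-AUC can differ slightly.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_precision(is_positive, scores):
    """Mean of precision@k over the ranks k at which a positive appears.

    Assumes no tied scores; ties would need an extra grouping step.
    """
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Toy data (hypothetical): one class label per position.
y_true = ["donor", "neither", "acceptor", "neither", "donor"]
y_pred = ["donor", "neither", "neither", "neither", "donor"]
print(f"accuracy = {accuracy(y_true, y_pred):.2f}")

# One-vs-rest view of the "donor" class with made-up scores.
is_donor = [1, 0, 0, 0, 1]
donor_scores = [0.9, 0.2, 0.6, 0.1, 0.4]
print(f"AP(donor) = {average_precision(is_donor, donor_scores):.3f}")
```

With classes as imbalanced as splice sites usually are, accuracy can stay high while the per-class AP is poor, which is why the ranking metrics are listed alongside it.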

For Delta Prediction Experiments¶

Pearson r: Correlation with true deltas
Detection Rate: % of splice-altering variants detected
Mean |Δ|: Average absolute delta score
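
The three delta metrics can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the data are made up, and the detection threshold of 0.1 on |Δ| is an assumed value, since the threshold used by the experiments is not stated here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def detection_rate(pred_deltas, is_splice_altering, threshold=0.1):
    """Percent of splice-altering variants with |predicted delta| >= threshold.

    The default threshold is an assumption made for this sketch.
    """
    altering = [abs(d) for d, flag in zip(pred_deltas, is_splice_altering) if flag]
    if not altering:
        raise ValueError("no splice-altering variants in the input")
    return 100.0 * sum(d >= threshold for d in altering) / len(altering)


def mean_abs_delta(pred_deltas):
    """Average absolute predicted delta over all variants."""
    if not pred_deltas:
        raise ValueError("need at least one delta")
    return sum(abs(d) for d in pred_deltas) / len(pred_deltas)


# Toy data (hypothetical): target deltas, predicted deltas, splice-altering flags.
true_deltas = [0.5, -0.3, 0.0, 0.2]
pred_deltas = [0.4, -0.2, 0.05, 0.0]
altering = [1, 1, 0, 1]

print(f"Pearson r      = {pearson_r(true_deltas, pred_deltas):.3f}")
print(f"Detection rate = {detection_rate(pred_deltas, altering):.1f}%")
print(f"Mean |delta|   = {mean_abs_delta(pred_deltas):.4f}")
```

Note that Pearson r is computed over all variants, while the detection rate looks only at the splice-altering ones, so the two can move independently.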

Quick Reference¶

Current Best Results¶

Task	Best Model	Metric	Value
Classification	Meta-Layer (001)	Accuracy	99.11%
Variant Detection	Validated Delta (004)	Correlation	r=0.41
Binary Classification	Multi-Step (003)	AUC	0.61

Key Findings¶

Classification ≠ Detection: High classification accuracy (99.11% in 001) does not translate into detecting splice-altering variants
Training objective matters: The model must be trained on the same task it will be evaluated on
Target quality matters: Paired prediction (002) learns from base model deltas that may themselves be wrong, which limits what it can achieve
Validated targets work better: Filtering targets with SpliceVarDB raises the correlation from r=0.38 to r=0.41
Binary classification is learnable: AUC=0.61 beats random guessing, but F1=0.53 is still well short of the >0.7 needed
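
To make the last finding concrete, the sketch below shows how the two numbers are computed, assuming the reported AUC is ROC AUC. AUC is threshold-free: it is the probability that a randomly chosen positive outscores a randomly chosen negative. F1 depends on a decision threshold. The function names, the toy data, and the 0.5 threshold are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Tied pairs count as half a win. O(P * N), fine for small examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(labels, scores, threshold=0.5):
    """F1 of the positive class when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 1 = splice-altering, 0 = not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"F1  = {f1_at_threshold(labels, scores):.3f}")
```

Because F1 is tied to one threshold, sweeping the threshold on a validation set is a cheap first check before concluding that the classifier itself needs to change.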

How to Add a New Experiment¶

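Create a directory named NNN_experiment_name/, using the next unused three-digit ID
Copy the template files from an existing experiment
Update README.md with the hypothesis and setup
Run the experiment and record the results
Analyze the results and document the insights
Add the experiment to the index in this file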
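
The first steps could be automated with a small helper along these lines. This is a sketch under assumptions, not an existing project script: the function names and the README stub are hypothetical, and the stub only mirrors the README.md contents listed under Directory Structure. The demo runs in a temporary directory so that nothing real is touched.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical stub; adapt it to match an existing experiment's README.md.
README_STUB = "{title}\n\nHypothesis:\n\nSetup:\n\nResults summary:\n"


def next_experiment_id(root):
    """Return the next zero-padded three-digit ID after the highest NNN_* directory."""
    highest = 0
    for path in Path(root).iterdir():
        match = re.match(r"(\d{3})_", path.name)
        if path.is_dir() and match:
            highest = max(highest, int(match.group(1)))
    return f"{highest + 1:03d}"


def scaffold_experiment(root, name):
    """Create NNN_name/ with a README.md stub and an empty figures/ directory."""
    exp_dir = Path(root) / f"{next_experiment_id(root)}_{name}"
    exp_dir.mkdir()  # raises FileExistsError rather than overwriting
    (exp_dir / "figures").mkdir()
    title = name.replace("_", " ").title()
    (exp_dir / "README.md").write_text(README_STUB.format(title=title), encoding="utf-8")
    return exp_dir


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for existing in ("001_canonical_classification", "004_validated_delta"):
        (Path(tmp) / existing).mkdir()
    created = scaffold_experiment(tmp, "new_idea")
    print(created.name)  # 005_new_idea
```

The index in this file still has to be updated by hand after the experiment is run.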

Related Documentation¶

ARCHITECTURE.md - Meta-layer architecture
LABELING_STRATEGY.md - Labeling approaches (planned)
methods/ - Methodology documentation