Validated Delta Prediction addresses a fundamental limitation of paired (Siamese) prediction: base model deltas may be inaccurate for non-splice-altering variants.
This approach uses SpliceVarDB classifications to filter and validate training targets, so the model learns from ground truth rather than from potentially incorrect base model predictions.
Paired Prediction (Previous Approach):
    Target = base_model(alt) - base_model(ref)
    Issue: If the variant is NOT splice-altering but the base model predicts
           a delta anyway, we're training on wrong labels!

Validated Delta Prediction:
    If SpliceVarDB says "Splice-altering":
        Target = base_model(alt) - base_model(ref)   # Trust base model
    If SpliceVarDB says "Normal":
        Target = [0, 0, 0]                           # Override base model - no effect!
    If SpliceVarDB says "Low-frequency" or "Conflicting":
        SKIP                                         # Uncertain, don't train on it
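The target-selection rule above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the function name `validated_target`, the label sets, and the use of plain lists for the base model scores are all assumptions made for the example, as is the exact spelling of the SpliceVarDB labels.

```python
from typing import List, Optional, Sequence

# Hypothetical label groups; the exact strings in a SpliceVarDB export
# are an assumption taken from the rule described above.
TRUST_BASE_MODEL = {"Splice-altering"}
FORCE_ZERO = {"Normal"}
SKIP = {"Low-frequency", "Conflicting"}


def validated_target(
    classification: str,
    ref_scores: Sequence[float],
    alt_scores: Sequence[float],
) -> Optional[List[float]]:
    """Return the training target for one variant, or None to skip it."""
    if len(ref_scores) != len(alt_scores):
        raise ValueError("ref and alt scores must have the same length")

    if classification in TRUST_BASE_MODEL:
        # Trust the base model: target is alt minus ref, element-wise.
        return [alt - ref for ref, alt in zip(ref_scores, alt_scores)]
    if classification in FORCE_ZERO:
        # Override the base model: a normal variant has no effect.
        return [0.0] * len(ref_scores)
    if classification in SKIP:
        # Uncertain evidence: leave the variant out of training.
        return None
    raise ValueError(f"unknown classification: {classification!r}")


# Usage: build a training set, dropping the skipped variants.
# The scores below are made-up numbers, not real model output.
variants = [
    ("Splice-altering", [0.25, 0.5, 0.25], [0.5, 0.25, 0.25]),
    ("Normal", [0.25, 0.5, 0.25], [0.5, 0.25, 0.25]),
    ("Conflicting", [0.25, 0.5, 0.25], [0.5, 0.25, 0.25]),
]
targets = [validated_target(label, ref, alt) for label, ref, alt in variants]
training_targets = [t for t in targets if t is not None]
```

Returning `None` for uncertain variants keeps the skip decision explicit, so the caller filters them out rather than training on a placeholder value.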
    from agentic_spliceai.splice_engine.meta_layer.models import (
        ValidatedDeltaPredictor,
        create_validated_delta_predictor,
    )

    # Create model
    model = create_validated_delta_predictor(
        variant='basic',  # or 'attention' for interpretability
        hidden_dim=128,
        n_layers=6,
        dropout=0.1,
    )

    # Training config
    config = {
        'epochs': 40,
        'batch_size': 32,
        'learning_rate': 5e-5,
        'weight_decay': 0.02,
        'scheduler': 'OneCycleLR',
    }
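If the `'scheduler': 'OneCycleLR'` entry refers to PyTorch's `torch.optim.lr_scheduler.OneCycleLR` (an assumption; the source does not say how the config is consumed), the scheduler needs the schedule length up front, given either as `total_steps` or as `epochs` together with `steps_per_epoch`. The sketch below derives those numbers from the config. The helper `one_cycle_steps` and the training-set size of 1000 are hypothetical, and the last partial batch is assumed to be kept.

```python
import math

# Same values as the training config above.
config = {
    'epochs': 40,
    'batch_size': 32,
    'learning_rate': 5e-5,
    'weight_decay': 0.02,
    'scheduler': 'OneCycleLR',
}


def one_cycle_steps(config: dict, n_train_examples: int) -> tuple:
    """Return (steps_per_epoch, total_steps) for a one-cycle schedule.

    Assumes one scheduler step per batch and that the final partial
    batch of each epoch is kept rather than dropped.
    """
    if n_train_examples <= 0:
        raise ValueError("n_train_examples must be positive")
    steps_per_epoch = math.ceil(n_train_examples / config['batch_size'])
    total_steps = steps_per_epoch * config['epochs']
    return steps_per_epoch, total_steps


# Hypothetical dataset size: validated variants left after skipping
# the "Low-frequency" and "Conflicting" ones.
steps_per_epoch, total_steps = one_cycle_steps(config, n_train_examples=1000)
```

Because skipped variants shrink the training set, the step count is best computed after filtering, not from the raw variant count.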