System Design Documentation

Purpose: Architectural principles and design patterns for agentic-spliceai
Audience: Developers, contributors, and future maintainers
Status: Living documentation - updated during refactoring

Overview

This directory documents the architectural foundations of agentic-spliceai. These are design decisions that affect the entire system and should be understood before making significant changes.

System design documentation differs from implementation details:

- System design (here): why we made architectural choices and the principles behind them
- Implementation: how we're implementing those designs during refactoring (see source code)

Core System Design Areas

1. Resource Management

Why it matters: All code needs to access data, models, and genomic resources consistently

What it covers:

- Centralized path resolution
- Project root detection
- Data directory structure
- Configuration-driven resource access
- Environment-agnostic portability

Key principle: Single source of truth for all paths
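
As a rough illustration of project root detection and centralized path resolution, the sketch below walks upward from a starting directory until it finds a marker file. The function names, the marker files, and the `data/` layout are assumptions made for this example; they are not taken from `genomic_config.py`.

```python
from pathlib import Path

# Hypothetical marker files; the project's actual markers may differ.
ROOT_MARKERS = ("pyproject.toml", ".git")


def find_project_root(start: Path, markers: tuple[str, ...] = ROOT_MARKERS) -> Path:
    """Walk upward from `start` until a directory containing a marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(
        f"No project root found above {start}; expected one of {markers}"
    )


def resolve_data_path(root: Path, *parts: str) -> Path:
    """Build every data path from the single detected root (assumed 'data/' layout)."""
    return root / "data" / Path(*parts)
```

Because every caller derives paths from the same detected root, moving the project or running it on another machine should not require editing paths in code.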

2. Output & Artifact Management

Why it matters: Research code produces models, predictions, metrics, and experimental artifacts

What it covers:

- Artifact lifecycle management
- Experiment organization
- Reproducibility tracking
- Mode-based isolation (production/dev/test)
- Immutability policies

Key principle: Systematic organization prevents "where did I save that?" chaos
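
One way mode-based isolation and an immutability policy could fit together is sketched below. The class reuses the `ArtifactManager` name mentioned in this document, but its fields, methods, directory layout, and write-once rule for production are all assumptions for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ArtifactManager:
    """Hypothetical sketch: routes outputs to <root>/<mode>/<experiment>/."""

    root: Path
    mode: str  # "production", "dev", or "test"
    experiment: str

    def __post_init__(self) -> None:
        if self.mode not in ("production", "dev", "test"):
            raise ValueError(f"Unknown mode {self.mode!r}; use production, dev, or test")

    def path_for(self, name: str) -> Path:
        """Return the single canonical location for an artifact."""
        return self.root / self.mode / self.experiment / name

    def save_text(self, name: str, content: str) -> Path:
        target = self.path_for(name)
        # Assumed immutability policy: production artifacts are write-once.
        if self.mode == "production" and target.exists():
            raise FileExistsError(
                f"{target} already exists; production artifacts are immutable"
            )
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        return target
```

Keeping the mode in the path means a dev or test run cannot overwrite a production artifact by accident.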

3. Configuration System

Why it matters: Supports multiple genome builds, base models, and execution environments

What it covers:

- YAML-based configuration
- Environment variable overrides
- Type-safe config with dataclasses
- Build-specific settings
- Multi-environment support (local, remote, CI/CD)

Key principle: Configuration, not code changes, for different setups
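
The following is a minimal sketch of how a type-safe dataclass config might combine YAML values with environment variable overrides. To stay self-contained it takes an already-parsed mapping instead of reading a YAML file. The field names and the `SPLICEAI_` prefix are hypothetical, chosen only for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GenomeConfig:
    """Hypothetical settings; field names are illustrative only."""

    build: str
    data_dir: str


def load_config(
    yaml_values: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> GenomeConfig:
    """Merge parsed YAML values with environment overrides (the environment wins)."""
    merged = dict(yaml_values)
    for field_name in ("build", "data_dir"):
        # Assumed naming scheme: SPLICEAI_<FIELD> overrides the YAML value.
        override = env.get(f"SPLICEAI_{field_name.upper()}")
        if override is not None:
            merged[field_name] = override
    missing = [name for name in ("build", "data_dir") if name not in merged]
    if missing:
        raise KeyError(f"Missing required settings: {missing}")
    return GenomeConfig(build=merged["build"], data_dir=merged["data_dir"])
```

Passing `env` explicitly keeps the function easy to test, while the default still reads the real process environment.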

4. Base Layer Architecture

Why it matters: Foundation for splice site prediction with extensible base model support

What it covers:

- Abstract base model interface
- Coordinate system conventions
- Prediction workflow pipeline
- Data preparation architecture
- Chunking and checkpointing

Key principle: Abstract interface allows pluggable base models
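
A pluggable interface of this kind can be sketched with Python's `abc` module. The `BaseModel` name comes from this document, but the `predict` signature, the return type, and the toy `UniformModel` are assumptions for illustration; the real interface may look quite different.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical interface; the project's real BaseModel may differ."""

    @abstractmethod
    def predict(self, sequence: str) -> list[tuple[float, float]]:
        """Return (donor, acceptor) scores for each position in `sequence`."""


class UniformModel(BaseModel):
    """Toy stand-in that assigns the same scores to every position."""

    def predict(self, sequence: str) -> list[tuple[float, float]]:
        return [(0.0, 0.0) for _ in sequence]


def run_prediction(model: BaseModel, sequence: str) -> list[tuple[float, float]]:
    # The workflow depends only on the abstract interface, so any
    # conforming model can be swapped in without changing this function.
    return model.predict(sequence)
```

Because `BaseModel` declares `predict` as abstract, Python refuses to instantiate it or any subclass that does not implement `predict`.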

5. Meta Layer Architecture

Why it matters: Multimodal meta-learning for adaptive splice prediction

What it covers:

- Foundation-Adaptor framework
- Context integration patterns
- Training and inference pipelines
- Feature engineering
- Performance optimization

Key principle: Meta-learning adapts base predictions to specific contexts

Design Principles

These principles guide all architectural decisions:

1. Separation of Concerns

Each component has a single, well-defined responsibility:

- genomic_config.py → Path resolution
- ArtifactManager → Output management
- BaseModel → Prediction interface

Why: Makes code easier to understand, test, and modify

2. Single Source of Truth

Critical information has exactly one authoritative location:

- Paths → Configuration system
- Schemas → Type definitions
- Defaults → YAML config files

Why: Prevents inconsistencies and "which version is correct?" confusion

3. Configuration Over Code

Use configuration files for environment-specific settings:

- Data paths → settings.yaml
- Model versions → Environment variables
- Build selection → Config, not hardcoded values

Why: Same code runs everywhere without modification

4. Explicit Over Implicit

Make dependencies and requirements visible:

- Type annotations everywhere
- Explicit imports
- Named parameters over positional ones
- Clear error messages

Why: Code is self-documenting and errors are understandable

5. Progressive Disclosure

Expose simple interfaces, hide complex implementation:

- Public API → Simple functions
- Internal → Complex logic
- Defaults → Common cases
- Options → Advanced use

Why: Easy to start, powerful when needed

6. Fail Fast, Fail Clearly

Detect problems early with helpful messages:

- Validate inputs immediately
- Check file existence before processing
- Descriptive error messages with solutions
- Type checking at boundaries

Why: Problems are caught early when they're easy to fix
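
The checks listed above could look something like the sketch below. The helper names `require_file` and `require_positive` are hypothetical; the point is only that each check runs before any expensive work starts and that the error message suggests a fix.

```python
from pathlib import Path


def require_file(path: Path, hint: str) -> Path:
    """Raise immediately, with a suggested fix, if `path` is not an existing file."""
    if not path.is_file():
        raise FileNotFoundError(f"Required file not found: {path}. {hint}")
    return path


def require_positive(name: str, value: int) -> int:
    """Validate a numeric input at the boundary instead of deep inside a pipeline."""
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value
```

Both helpers return the validated value, so they can wrap arguments at the top of an entry point without extra bookkeeping.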

7. Testability

Design for easy testing:

- Pure functions where possible
- Dependency injection
- Mock-friendly interfaces
- Isolated test environments

Why: Confidence in changes, faster development
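
To make the pure-function and dependency-injection points concrete, here is a small sketch. Both function names are invented for this example and are not part of the project. The second function receives its sequence loader as a parameter, so a test can pass an in-memory stub instead of touching real genomic files.

```python
from typing import Callable


def gc_fraction(sequence: str) -> float:
    """Pure function: the result depends only on the argument."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def gc_fraction_from_source(name: str, fetch_sequence: Callable[[str], str]) -> float:
    """The sequence loader is injected, so tests can supply a stub."""
    return gc_fraction(fetch_sequence(name))
```

A test might call `gc_fraction_from_source("any-name", lambda _name: "ACGT")` and check the result without any file I/O.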

Refactoring Context

These design documents inform the ongoing refactoring from meta-spliceai to agentic-spliceai.

Current Phase: Base Layer Refactoring

We're applying these principles to:

- Extract base layer from monolithic workflow
- Implement resource management system
- Create artifact management infrastructure
- Establish configuration foundations

Status: In progress; see source code under src/agentic_spliceai/

How to Use This Documentation

Before Making Architectural Changes

1. Read the relevant design document to understand the current architecture
2. Understand the principles behind the design
3. Propose changes that align with those principles
4. Update the documentation if the design evolves

When Adding New Components

1. Follow the established patterns documented here
2. Apply the design principles consistently
3. Document significant decisions in the relevant design doc
4. Add examples to help future developers

When Debugging Design Issues

1. Check whether the issue violates a design principle
2. Review the design docs for the intended architecture
3. Consider whether refactoring would prevent similar issues
4. Update the docs with lessons learned

Document Status

| Document | Status | Last Updated |
|---|---|---|
| Resource Management | ✅ Complete | Feb 15, 2026 |
| Output Management | ✅ Complete | Feb 15, 2026 |
| Configuration System | ✅ Complete | Feb 15, 2026 |
| Base Layer Architecture | 📝 Draft | TBD |
| Meta Layer Architecture | 📝 Draft | TBD |

Code Documentation

- src/agentic_spliceai/splice_engine/config/ - Configuration implementation
- STRUCTURE.md - Project structure overview

User Guides

- SETUP.md - Setup and configuration
- QUICKSTART.md - Getting started

Contributing

When contributing architectural changes:

1. Discuss the design in issues/PRs before implementing
2. Follow the existing patterns documented here
3. Update the design docs to reflect your changes
4. Add examples for new patterns
5. Explain the rationale for design decisions

Questions?

- Design questions: See the relevant design document in this directory
- Implementation questions: See source code under src/agentic_spliceai/
- Usage questions: See QUICKSTART.md or SETUP.md

Last Updated: February 15, 2026
Maintainer: agentic-spliceai team
Status: ✅ Core design documents complete, specialized docs in progress