System Design Documentation¶
Purpose: Architectural principles and design patterns for agentic-spliceai
Audience: Developers, contributors, and future maintainers
Status: Living documentation - updated during refactoring
Overview¶
This directory documents the architectural foundations of agentic-spliceai. These are design decisions that affect the entire system and should be understood before making significant changes.
System design documentation differs from implementation details: - System design (here): Why we made architectural choices, the principles behind them - Implementation: How we're implementing those designs during refactoring (see source code)
Core System Design Areas¶
1. Resource Management¶
Why it matters: All code needs to access data, models, and genomic resources consistently
What it covers: - Centralized path resolution - Project root detection - Data directory structure - Configuration-driven resource access - Environment-agnostic portability
Key principle: Single source of truth for all paths
2. Output & Artifact Management¶
Why it matters: Research code produces models, predictions, metrics, and experimental artifacts
What it covers: - Artifact lifecycle management - Experiment organization - Reproducibility tracking - Mode-based isolation (production/dev/test) - Immutability policies
Key principle: Systematic organization prevents "where did I save that?" chaos
3. Configuration System¶
Why it matters: Supports multiple genome builds, base models, and execution environments
What it covers: - YAML-based configuration - Environment variable overrides - Type-safe config with dataclasses - Build-specific settings - Multi-environment support (local, remote, CI/CD)
Key principle: Configuration, not code changes, for different setups
4. Base Layer Architecture¶
Why it matters: Foundation for splice site prediction with extensible base model support
What it covers: - Abstract base model interface - Coordinate system conventions - Prediction workflow pipeline - Data preparation architecture - Chunking and checkpointing
Key principle: Abstract interface allows pluggable base models
5. Meta Layer Architecture¶
Why it matters: Multimodal meta-learning for adaptive splice prediction
What it covers: - Foundation-Adaptor framework - Context integration patterns - Training and inference pipelines - Feature engineering - Performance optimization
Key principle: Meta-learning adapts base predictions to specific contexts
Design Principles¶
These principles guide all architectural decisions:
1. Separation of Concerns¶
Each component has a single, well-defined responsibility:
- genomic_config.py → Path resolution
- ArtifactManager → Output management
- BaseModel → Prediction interface
Why: Makes code easier to understand, test, and modify
2. Single Source of Truth¶
Critical information has exactly one authoritative location: - Paths → Configuration system - Schemas → Type definitions - Defaults → YAML config files
Why: Prevents inconsistencies and "which version is correct?" confusion
3. Configuration Over Code¶
Use configuration files for environment-specific settings:
- Data paths → settings.yaml
- Model versions → Environment variables
- Build selection → Config, not hardcode
Why: Same code runs everywhere without modification
4. Explicit Over Implicit¶
Make dependencies and requirements visible: - Type annotations everywhere - Explicit imports - Named parameters over positionals - Clear error messages
Why: Code is self-documenting and errors are understandable
5. Progressive Disclosure¶
Expose simple interfaces, hide complex implementation: - Public API → Simple functions - Internal → Complex logic - Defaults → Common cases - Options → Advanced use
Why: Easy to start, powerful when needed
6. Fail Fast, Fail Clearly¶
Detect problems early with helpful messages: - Validate inputs immediately - Check file existence before processing - Descriptive error messages with solutions - Type checking at boundaries
Why: Problems are caught early when they're easy to fix
7. Testability¶
Design for easy testing: - Pure functions where possible - Dependency injection - Mock-friendly interfaces - Isolated test environments
Why: Confidence in changes, faster development
Refactoring Context¶
These design documents inform the ongoing refactoring from meta-spliceai to agentic-spliceai.
Current Phase: Base Layer Refactoring¶
We're applying these principles to: - Extract base layer from monolithic workflow - Implement resource management system - Create artifact management infrastructure - Establish configuration foundations
Status: In progress — see source code under src/agentic_spliceai/
How to Use This Documentation¶
Before Making Architectural Changes¶
- Read relevant design document to understand current architecture
- Understand the principles behind the design
- Propose changes that align with principles
- Update documentation if design evolves
When Adding New Components¶
- Follow established patterns documented here
- Apply design principles consistently
- Document significant decisions in relevant design doc
- Add examples to help future developers
When Debugging Design Issues¶
- Check if issue violates a design principle
- Review design docs for intended architecture
- Consider if refactoring would prevent similar issues
- Update docs with lessons learned
Document Status¶
| Document | Status | Last Updated |
|---|---|---|
| Resource Management | ✅ Complete | Feb 15, 2026 |
| Output Management | ✅ Complete | Feb 15, 2026 |
| Configuration System | ✅ Complete | Feb 15, 2026 |
| Base Layer Architecture | 📝 Draft | TBD |
| Meta Layer Architecture | 📝 Draft | TBD |
Related Documentation¶
Code Documentation¶
src/agentic_spliceai/splice_engine/config/- Configuration implementation- STRUCTURE.md - Project structure overview
User Guides¶
- SETUP.md - Setup and configuration
- QUICKSTART.md - Getting started
Contributing¶
When contributing architectural changes:
- Discuss design in issues/PRs before implementing
- Follow existing patterns documented here
- Update design docs to reflect changes
- Add examples for new patterns
- Explain rationale for design decisions
Questions?¶
- Design questions: See relevant design document in this directory
- Implementation questions: See source code under
src/agentic_spliceai/ - Usage questions: See QUICKSTART.md or SETUP.md
Last Updated: February 15, 2026
Maintainer: agentic-spliceai team
Status: ✅ Core design documents complete, specialized docs in progress