Codebase Architecture

Separation of Concerns (SoC): a structural foundation that physically isolates five operational domains (code, configuration, environment, infrastructure, and data).

🧠 src/ (Core Logic)

Contains generalized, reusable Python code. Functions hold no state of their own and do nothing until they are imported and called.
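
As a minimal sketch of what "agnostic to state" could look like, the function below takes everything it needs as arguments and returns a value without touching files, globals, or environment variables. The module path `src/metrics.py` and the function name `accuracy` are hypothetical and chosen only for illustration.

```python
# Hypothetical contents of src/metrics.py (the file and function names are assumptions).
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of positions where the two label sequences agree.

    The function reads no global state and performs no I/O, so it behaves
    the same whether a notebook or a script calls it.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)
```

Because the module runs nothing at import time, importing it from a notebook or a script has no side effects.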

🚀 scripts/ (Operational Execution)

Procedural entry points (e.g., train.py). They read environment variables, parse configs, and pass the resulting settings to src/ logic.
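
The flow described above (environment variable, config file, call into src/) might be sketched as follows. This is an illustration under stated assumptions, not the project's actual train.py: the `--config` flag, the `DATA_DIR` variable, the JSON config format, and the `train_model` function are all hypothetical, and `train_model` is defined inline here only so the sketch runs on its own.

```python
# Hypothetical sketch of scripts/train.py (all names below are assumptions).
import argparse
import json
import os
from pathlib import Path


def train_model(learning_rate: float, epochs: int, data_dir: Path) -> dict:
    """Stand-in for logic that would normally be imported from src/."""
    return {
        "learning_rate": learning_rate,
        "epochs": epochs,
        "data_dir": str(data_dir),
    }


def main(argv=None) -> dict:
    # Parse the path to the config file from the command line.
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--config", type=Path, required=True)
    args = parser.parse_args(argv)

    # Configuration comes from a file; the data location comes from the environment.
    config = json.loads(args.config.read_text())
    data_dir = Path(os.environ.get("DATA_DIR", "data"))

    # The script only wires settings together; the real work stays in src/.
    return train_model(config["learning_rate"], config["epochs"], data_dir)

# A real script would typically end with:
#     if __name__ == "__main__":
#         main()
```

Keeping `main` free of business logic means the same src/ function can be called from a notebook with hand-picked arguments, without going through the command line.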

📓 notebooks/ (Exploration)

Reserved exclusively for EDA and prototyping. Imports from src/, but is never imported itself.

📦 containers/ (Infrastructure)

Houses declarative blueprints (Dockerfile, Apptainer.def), keeping the repository root clean.

💾 data/

A strictly version-controlled directory that holds raw and processed datasets, isolated from execution logic.

This structure allows the exact same src/ logic to be executed interactively in notebooks/ or automatically via scripts/, within a reproducible environment defined in containers/.