Delimitation


One repository keeps things simple, but can break down at scale:

Distribution & Licensing

Code might be open-source while a dataset it depends on prohibits redistribution.

Complexity & Bloat

Large repositories get slow, the history becomes noisy, and relationships between parts become unclear.

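Solution: Split the project into focused repositories and link them with Git submodules. A submodule records the URL of the linked repository and pins it to one specific commit.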
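As a minimal sketch of such a link, the script below adds a dataset repository to an analysis repository as a submodule. The repository names (dataset, analysis), the mount path data/raw, and the commit identity are all hypothetical. Local throwaway repositories stand in for real remotes so the script runs offline, and the protocol.file.allow override is only needed for that local stand-in.

```shell
#!/bin/sh
# Sketch only: repository names and paths are hypothetical placeholders.
set -eu

work=$(mktemp -d)
cd "$work"

# Shorthand that supplies a commit identity without touching global config.
g() { git -c user.name="Example" -c user.email="example@example.org" "$@"; }

# 1. A focused repository that holds only the dataset.
g init -q dataset
echo "id,value" > dataset/data.csv
g -C dataset add data.csv
g -C dataset commit -q -m "Add dataset"

# 2. The analysis repository that depends on it.
g init -q analysis
cd analysis

# 3. Link the dataset at data/raw.
#    protocol.file.allow=always is required only because this demo uses a
#    local path as the URL; a real https or ssh URL would not need it.
g -c protocol.file.allow=always submodule add -q "$work/dataset" data/raw
g commit -q -m "Link dataset as submodule"

# The link is recorded in .gitmodules and pinned to one dataset commit.
cat .gitmodules
g submodule status
```

With a real remote, step 3 reduces to git submodule add followed by the repository URL and the target path. Collaborators then obtain the linked content with git clone --recurse-submodules, or with git submodule update --init in an existing clone.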

When to create a separate repository:

  • An element (e.g., a curated dataset) is broadly useful in isolation for entirely different projects.

  • An element (e.g., public-facing documentation or a web dashboard) needs to evolve or be deployed independently of the main analytical code.

  • An element is subject to different ownership, access rights, or terms of use.

Manuscripts

Papers, theses, and book chapters deserve their own repository. Include the analysis repo as a submodule to reference generated figures and tables.
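The manuscript setup can be sketched in the same way. The example below assumes a hypothetical analysis repository with a figures directory and a manuscript repository that mounts it at analysis/. The file names are placeholders, and local repositories again stand in for real remotes. It also shows one possible way to pick up regenerated results later.

```shell
#!/bin/sh
# Sketch only: repository names, paths, and file contents are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# Commit identity plus the override needed for local-path submodule URLs.
g() {
  git -c user.name="Example" -c user.email="example@example.org" \
      -c protocol.file.allow=always "$@"
}

# Analysis repository with one generated figure (placeholder content).
g init -q analysis
mkdir analysis/figures
echo "placeholder for a generated figure" > analysis/figures/fig1.txt
g -C analysis add figures
g -C analysis commit -q -m "Add generated figure"

# Manuscript repository that pins the analysis repository at ./analysis.
g init -q manuscript
cd manuscript
g submodule add -q "$work/analysis" analysis
echo "Figure 1: see analysis/figures/fig1.txt" > paper.txt
g add paper.txt
g commit -q -m "Add manuscript and pin analysis results"

# Later: the analysis is rerun and the figure changes.
echo "regenerated figure" > "$work/analysis/figures/fig1.txt"
g -C "$work/analysis" commit -q -am "Regenerate figure"

# Move the submodule to the newest analysis commit and record the new pin.
g submodule update -q --remote analysis
g add analysis
g commit -q -m "Update analysis to latest results"

cat analysis/figures/fig1.txt
```

Because the manuscript stores a commit pointer rather than a copy, every figure and table it references can be traced to an exact state of the analysis code.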