Delimitation
One repository keeps things simple, but can break down at scale:
Code might be open source while a dataset it depends on prohibits redistribution.
Large repositories get slow, the history becomes noisy, and relationships between parts become unclear.
Solution: Split the project into focused repositories and link them with Git submodules.
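As a rough illustration of the linking step, the sketch below builds the two Git commands typically involved: `git submodule add` to attach a repository at a path, and a commit to record it. The URL `https://example.org/lab/survey-data.git`, the path `data/survey`, and the helper names `submodule_commands` and `run_all` are hypothetical and chosen only for this example. By default the commands are printed rather than executed.

```python
import shlex
import subprocess
from pathlib import Path


def submodule_commands(url: str, path: str) -> list[list[str]]:
    """Return the Git commands that link the repository at `url` under `path`."""
    return [
        # Clone `url` into `path` and record the link in .gitmodules.
        ["git", "submodule", "add", url, path],
        # Commit the staged .gitmodules entry and the pinned submodule commit.
        ["git", "commit", "-m", f"Add {path} as a submodule"],
    ]


def run_all(commands: list[list[str]], repo: Path, dry_run: bool = True) -> list[str]:
    """Print each command; also execute it inside `repo` when dry_run is False."""
    rendered = []
    for cmd in commands:
        line = shlex.join(cmd)
        rendered.append(line)
        print(line)
        if not dry_run:
            subprocess.run(cmd, cwd=repo, check=True)
    return rendered


if __name__ == "__main__":
    # Hypothetical URL and path, used only for illustration.
    cmds = submodule_commands("https://example.org/lab/survey-data.git", "data/survey")
    run_all(cmds, Path("."), dry_run=True)
```

Collaborators who clone the parent repository afterwards can fetch the linked repositories in the same step with `git clone --recurse-submodules`.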
When to create a separate repository:
An element (e.g., a curated dataset) is broadly useful in isolation for entirely different projects.
An element (e.g., public-facing documentation or a web dashboard) needs to evolve or be deployed independently of the main analytical code.
An element is subject to different ownership, access rights, or terms of use.
Manuscripts
Papers, theses, and book chapters deserve their own repository. Include the analysis repository as a submodule so the manuscript can reference the generated figures and tables.
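One way a manuscript repository might locate those outputs is sketched below. It assumes, purely for illustration, that the analysis repository is mounted as a submodule named `analysis` and writes its figures to a `figures` directory; neither name comes from this guide, and `find_figures` is a hypothetical helper.

```python
from pathlib import Path

# File types treated as figures in this sketch (an assumption, adjust as needed).
FIGURE_SUFFIXES = {".pdf", ".png", ".svg"}


def find_figures(
    manuscript_root: Path,
    submodule: str = "analysis",   # hypothetical submodule path
    figure_dir: str = "figures",   # hypothetical output directory
) -> list[Path]:
    """List figure files in the analysis submodule, relative to the manuscript root."""
    base = manuscript_root / submodule / figure_dir
    if not base.is_dir():
        # The submodule may not be initialized yet, or the directory may not exist.
        return []
    return sorted(
        p.relative_to(manuscript_root)
        for p in base.iterdir()
        if p.suffix in FIGURE_SUFFIXES
    )


if __name__ == "__main__":
    for figure in find_figures(Path(".")):
        print(figure)
```

An empty result is a hint that the submodule has not been checked out; running `git submodule update --init` in the manuscript repository usually resolves that.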