The Repository Should Match the Way the Software Ships

Most arguments about monorepos start in the wrong place. They usually start with tooling: build speed, dependency management, whether Google does it, whether the company has enough platform engineers to make it work. Those things matter, but they are not the heart of the issue. The heart of the issue is coordination.

Monorepos solve a real problem, and granular repos create a real problem, and both of those need to be said up front, because the lazy version of this conversation turns into another round of engineering religion. One camp acts like putting everything in one repository magically fixes software delivery, and the other camp acts like smaller repositories automatically create cleaner systems. Neither of those is true.

A repository structure is not just a place to put code. It is a delivery model, and it says something about ownership, dependencies, testing, release flow, deployment risk, and how much coordination people need before they can safely ship. When we forget that, we end up arguing about folder structure while the real system is breaking somewhere else.

We Have Been Here Before

It is worth being honest about where we came from. Back in the CVS, Visual SourceSafe, and SVN days, most repositories were effectively monorepos, and before that a lot of systems were natural monorepos: big folder structures on shared drives with piles of code, scripts, documents, build outputs, install notes, and whatever else the team needed to keep the system alive. Nobody called them monorepos because there was nothing special to call them. That was just where the code lived.

Releases were painful because everything was tied together, builds were fragile, and ownership was often unclear. A change in one area could break some unrelated part of the system, and the person who discovered it was usually the person trying to ship. Waterfall release cycles were not always a philosophical choice, because in a lot of places they were the only sane-looking option when the whole system moved as one large, risky object. You bundled everything up, tested everything you could, hoped the release notes were accurate, scheduled the deployment window, put people on standby, and pulled out rollback instructions that may or may not have been tested recently. Then everyone held their breath.

It was not elegant, but it made a certain kind of sense. When the code, the build, the release, and the organization were all tangled together, the release process became a ceremony around that risk. So when teams started breaking systems into smaller repositories, that also made sense. Smaller repos felt like progress because in many ways they were progress: they gave teams clearer boundaries, reduced local build pain, made ownership easier to describe, and let teams ship smaller pieces independently, at least in theory. They made it possible to stop treating the entire software estate as one release object, and that was a real improvement.

But the coordination problem did not disappear. It moved.

Granular Repos Hide Coupling

The problem with lots of small repositories is that they can make the system look cleaner than it really is. You may have one repo for the frontend, another for the API, another for a shared client library, and then more for infrastructure, database migrations, background workers, and some internal tool nobody remembers until it breaks. On paper that looks like separation.

In practice, a single feature may require changes across five of those repositories, and they have to come together in the right order. The library has to be published before the service can consume it, the service has to be deployed before the frontend can rely on the new behavior, the database migration has to be compatible with both the old and new code, and the infrastructure change has to land before the service starts using the new dependency. The coupling is still there. It is just hidden across repositories, package versions, pipeline timing, and Slack threads.
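To make that ordering concrete, here is a minimal sketch that treats the feature above as a dependency graph and asks for one valid landing order. The repository names and the exact edges are hypothetical, chosen only to mirror the example; in particular, placing the migration before the service is an assumption. The sorting uses Python's standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical repositories touched by one feature. Each key maps to the
# set of changes that must land before it (its predecessors).
must_land_after = {
    "client-library": set(),
    "db-migration": set(),    # assumed to land before the service
    "infrastructure": set(),
    "api-service": {"client-library", "db-migration", "infrastructure"},
    "frontend": {"api-service"},
}


def landing_order(graph):
    """Return one valid order in which the changes can land."""
    return list(TopologicalSorter(graph).static_order())


order = landing_order(must_land_after)
print(" -> ".join(order))
```

In a many-repo setup this graph usually exists only in people's heads and chat threads, which is the hidden coupling described above. Writing it down does not remove the coordination, but it makes the sequence something a person or a script can check.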

That creates a different kind of pain. Instead of one large codebase with visible coupling, you get many smaller codebases with hidden coupling, and instead of one painful release train you get dependency version drift, duplicated libraries, stale contracts, incompatible changes, and pull requests spread across repositories that all have to land in sequence. Anyone who has chased a cross-repo change through a dependency chain knows how this goes: the actual code change is small, and the coordination around it is what burns the week.

This is where monorepos start to look attractive again, and for good reason. A monorepo can make cross-cutting changes easier, reduce dependency pinball, make shared code more visible, and allow larger refactors that would be nearly impossible across many repositories. It can also expose the fact that systems are more connected than the org chart wants to admit, and that last part is the one that matters. A monorepo makes coupling visible, but it does not make coupling good, and visibility is not the same thing as discipline.

The Monorepo Failure Mode

The failure mode I worry about with monorepos is the one-size-fits-all effect. At first, standardization sounds like the whole point: one repository, one build system, one pipeline pattern, one way to run tests, one way to deploy. It is easy to sell because the alternative looks messy, and nobody wants seventy repositories with seventy different deployment stories.

But real systems push back, because different things have different lifecycles. A web application, a backend service, a shared library, a Terraform module, a database migration tool, a data pipeline, and a mobile app do not all behave the same way just because they live under the same root folder. They have different runtime requirements, testing strategies, release approvals, rollback paths, versioning rules, and operational risks, and often different teams carrying the pager after the code ships.

That is where the monorepo starts to grow exceptions, and the exceptions follow a pattern. Everything works the same, except for this service. Everything deploys through the same pipeline, except for that package. Everything runs the same test workflow, except for the legacy module that takes forty minutes. Everything follows the standard process, except for the one thing everyone is afraid to touch. The process still looks standardized from a distance, but up close it is a pile of conditional logic and tribal knowledge. That is not simplicity. It is complexity with a company-approved label.

This gets especially dangerous once CI/CD enters the picture, because CI/CD turns repository structure into daily behavior. It is one thing to say code can live together, and another to prove that teams can build, validate, deploy, and roll back their work without stepping on each other. A deployment pipeline is an organizational contract. It defines what has to be true before code is allowed to move forward, and when everything lives in one repository that contract has to serve many different kinds of work at once. Too loose and it does not protect production, too strict and it slows down teams that should be able to move independently, and with too many exceptions nobody really understands the contract anymore. That is when the people factor shows up.

The People Factor Is Not Optional

Engineers like to talk about monorepos as if the problem is mostly mechanical: a better build graph, smarter test selection, better caching, better tooling, better ownership metadata. All of that helps, and none of it removes the human problem. When a monorepo build breaks, who owns it? When one team's flaky test blocks another team's release, who feels the pain, and who is responsible for proving the downstream systems still work after a shared package changes? When a team needs a pipeline exception, who decides whether it is reasonable or whether the system is quietly becoming unmaintainable? When every team shares the same repository, the social contract matters as much as the technical one.
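As a rough illustration of what ownership metadata can and cannot do, the sketch below maps changed files to owning teams by longest path prefix. The paths, team names, and the `OWNERS` table are all hypothetical; real tools have their own formats and matching rules, and this is not a model of any particular one.

```python
from pathlib import PurePosixPath

# Hypothetical ownership map: path prefix -> owning team.
# The empty prefix is a fallback so that gaps show up as "unowned"
# instead of silently matching nothing.
OWNERS = {
    "services/billing": "team-billing",
    "services/search": "team-search",
    "libs/http-client": "team-platform",
    "": "unowned",
}


def owner_of(path: str) -> str:
    """Return the owner registered for the longest matching path prefix."""
    parts = PurePosixPath(path).parts
    for depth in range(len(parts), -1, -1):
        prefix = "/".join(parts[:depth])
        if prefix in OWNERS:
            return OWNERS[prefix]
    return "unowned"


def owners_for_change(changed_paths):
    """Group the changed files by the team responsible for them."""
    result = {}
    for path in changed_paths:
        result.setdefault(owner_of(path), []).append(path)
    return result


change = ["services/billing/invoice.py", "libs/http-client/retry.py", "tools/old-script.sh"]
for team, paths in owners_for_change(change).items():
    print(team, paths)
```

The lookup answers who is listed for a path. It does not answer whether that team accepts the responsibility, which is the part no table can settle.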

This is where monorepos can become stressful, not because the idea is bad, but because they compress a lot of organizational behavior into one visible place. Teams that were already aligned tend to benefit from that. Teams that were not aligned may suddenly discover that their boundaries were more aspirational than real, and that discovery can be useful, but it is not free. A monorepo will not create ownership where ownership does not already exist. It will only reveal the lack of it faster.

That is why I do not like treating a monorepo as an architecture strategy by itself. A monorepo is not architecture, it is a forcing function. It can force shared standards, dependency visibility, and build and test discipline, and it can force teams to confront coupling they had been ignoring. But it cannot decide what should be independently deployable, who owns production, whether a shared library should be versioned separately, or whether a service should be allowed to release without coordinating with half the company. Those are architecture and ownership decisions, and the repository should reflect them, not replace them.

Size Repositories Around Deployables and Publishables

My default preference is simple. Size repositories at the level of a deployable or a publishable. That does not mean every class, module, Lambda function, or small service needs its own repository, because that way lies madness. Too many tiny repos create their own tax, turning simple changes into dependency scavenger hunts and making every improvement feel like a multi-step release campaign.

What I want is for the boundary to line up with the way the software actually moves through the world. If something deploys independently, gets published as a versioned package, or carries its own release lifecycle, rollback path, operational owner, and production risk profile, then that is a real boundary, and the repository should make it clear rather than burying it inside a larger structure just because the organization wanted fewer repos to look at.

In practice that means a backend service with its own deployment pipeline probably deserves a repository boundary, and a shared library that is published and consumed by multiple systems may deserve one too. An application made of internal modules that always build, test, and deploy together can live happily in one repo, and so can a cluster of packages that are tightly coupled and released as a unit. The point is not to invent a universal rule. The point is to make the repository boundary tell the truth.
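The default above can be written down as a small check. This is a sketch of the heuristic as a starting point for discussion, not a universal rule; the `Component` fields and the example names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical description of how a piece of software ships."""
    name: str
    deploys_independently: bool = False
    published_as_versioned_package: bool = False
    own_release_lifecycle: bool = False


def suggests_own_repo(c: Component) -> bool:
    """True when the component has a delivery boundary of its own."""
    return (
        c.deploys_independently
        or c.published_as_versioned_package
        or c.own_release_lifecycle
    )


components = [
    Component("billing-api", deploys_independently=True),
    Component("http-client", published_as_versioned_package=True),
    Component("billing-api/internal-module"),  # always ships with its parent
]

for c in components:
    verdict = "own repository" if suggests_own_repo(c) else "lives with its parent"
    print(f"{c.name}: {verdict}")
```

A component that answers no to all three questions ships as part of something else, so its repository boundary belongs to that something else.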

When the repo boundary matches the delivery boundary, CI/CD gets much easier to reason about. The pipeline validates the thing that will actually ship, the tests protect the deployable, the release process is owned by the team responsible for the outcome, rollback is not tangled up with unrelated code, and a broken build has a clear owner. That is boring in the best possible way, and boring production is a feature. Boring delivery starts with boundaries people can actually understand.

Standardization Should Not Mean Pretending Everything Is the Same

None of this is an argument against standardization. If anything it is the opposite, because most organizations need more consistency, not less. They need common pipeline patterns, quality gates, security checks, observability expectations, rollback discipline, and artifact handling, along with common ways to publish packages and deploy services. But common does not mean identical.

A good platform gives teams paved roads. It makes the right thing easy and the risky thing visible, and it reduces unnecessary variation without pretending that a frontend app, an API, an infrastructure module, and a shared library all have the same lifecycle. Bad standardization says everything must work the same way. Good standardization says the important parts must be consistent and the differences must be explicit, and that distinction is most of the game.

So if a service needs a special deployment path, make it visible. If a package has different versioning rules, document and automate them. If a legacy module has a slow test suite, the answer is not to hide it under a pipeline exception forever, but to decide whether the cost is acceptable, who owns it, and when it gets fixed. Exceptions are not the problem. Hidden exceptions are the problem. The real danger in both monorepos and many-repo systems is the same, which is that the actual process gets buried. In a many-repo world it hides in dependency chains and release coordination, and in a monorepo world it hides in conditional pipeline logic and unwritten rules. Different structure, same disease.
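One way to keep exceptions visible is to record each one with a reason, an owner, and a review date, and to have something complain when the date passes. The sketch below assumes a hypothetical in-code registry; the component names, teams, and dates are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PipelineException:
    """A documented deviation from the standard pipeline (hypothetical shape)."""
    component: str
    reason: str
    owner: str
    review_by: date


# Hypothetical registry. In practice this could live in a reviewed config file.
EXCEPTIONS = [
    PipelineException(
        "legacy-reports", "test suite takes forty minutes",
        "team-reporting", date(2025, 3, 1),
    ),
    PipelineException(
        "mobile-app", "store review adds a manual approval step",
        "team-mobile", date(2026, 1, 15),
    ),
]


def overdue(exceptions, today):
    """Return the exceptions whose review date has already passed."""
    return [e for e in exceptions if e.review_by < today]


for e in overdue(EXCEPTIONS, today=date(2025, 6, 1)):
    print(f"{e.component}: review overdue, owner {e.owner} ({e.reason})")
```

The design choice is that an exception without an owner or a review date cannot be recorded at all, so the cost of each one stays attached to someone who can decide whether it is still acceptable.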

The Better Question

So the better question is not whether to use a monorepo. It is where the coordination pain should live, because every repository strategy has a cost. A monorepo centralizes that coordination: it can make dependencies visible, simplify broad changes, and enforce shared standards, and it can also create pipeline complexity, noisy builds, ownership confusion, and pressure to make unlike things behave alike. Many repos push the coordination outward: they can give teams autonomy, clarify deployable boundaries, and keep pipelines focused, and they can also hide coupling, multiply dependency management, duplicate tooling, and make cross-cutting changes painful.

There is no free answer here. There is only the answer that matches the system you actually have and the system you are disciplined enough to operate, and that second half is the part teams skip. Do not choose a monorepo because you want to feel like a big engineering organization, and do not choose many repos because you want to feel like every team is autonomous. Choose the structure that reflects your real ownership model, delivery model, and operational reality.

If teams release together, share a lifecycle, and need frequent cross-cutting changes, a monorepo may be exactly right. If teams own independent deployables with different runtimes, different risks, and different release paths, forcing them into one repository tends to create more confusion than clarity. When a shared library is versioned and published, treat that publishable lifecycle as real, and when a service is deployed and rolled back on its own, treat that deployable lifecycle as real too. The system is what happens after the code ships, and your repository structure should not lie about that.

The Takeaway

Monorepos are not bad, and granular repos are not automatically good. Both are tools for managing coordination, and both fail in the same way: when they are used to avoid the harder conversations about ownership and delivery. The old shared-drive systems taught us that putting everything together can create painful releases and unclear ownership. The many-repo era taught us that splitting everything apart can hide coupling and turn simple changes into coordination work. The current monorepo trend is a reaction to that second lesson, and it genuinely solves part of it, but it can also recreate the older problem with newer tooling.

When a monorepo tries to make everything one size fits all, the organization pays for it in process stress, and CI/CD is where that stress shows up, because pipelines are the place architecture, ownership, and operational discipline stop being opinions and start becoming facts. That is why my bias is to keep repositories sized around deployables and publishables. It is not a perfect rule, but it is a useful default, because it keeps the repo boundary close to the delivery boundary, keeps ownership closer to production, and keeps pipelines focused on the thing that actually ships.

A monorepo can absolutely be the right answer when it reflects a real shared lifecycle. It becomes dangerous when it forces unrelated lifecycles to pretend they are the same. The goal was never fewer repos or more repos. The goal is a software delivery system where the boundaries are honest, the ownership is clear, and the path to production does not depend on heroics or folklore. The repository should match the way the software ships.

Frequently Asked Questions

Is a monorepo better than many small repositories?

Neither is better by default. Both are tools for managing coordination, and both can fail when they are used to avoid harder conversations about ownership and delivery. A monorepo centralizes coordination and makes coupling visible; many repos decentralize it and can hide coupling. The right answer is the one that matches the system you actually have and are disciplined enough to operate.

Why isn't repository structure mainly a tooling question?

Because a repository structure is a delivery model, not just a place to put code. It encodes ownership, dependencies, testing, release flow, deployment risk, and how much coordination people need before they can safely ship. Build speed and dependency management matter, but the heart of the issue is coordination, and arguing only about tooling means arguing about folder structure while the real system breaks somewhere else.

How should I decide where to draw a repository boundary?

Size repositories around deployables and publishables. If something deploys independently, is published as a versioned package, or has its own release lifecycle, rollback path, and operational owner, that usually deserves a repository boundary. The goal is to make the repo boundary match the delivery boundary so the pipeline validates the thing that actually ships.

What is the main failure mode of a monorepo?

The one-size-fits-all effect. When unlike lifecycles (a web app, a service, a shared library, an infrastructure module) are forced through one pipeline, the process slowly fills with exceptions and tribal knowledge. It still looks standardized from a distance, but up close it becomes conditional logic nobody fully understands. That is complexity with a company-approved label.

Does a monorepo fix ownership problems?

No. A monorepo is a forcing function, not architecture. It can force shared standards, dependency visibility, and build discipline, and it will reveal a lack of ownership faster. But it cannot create ownership where none exists, decide what should be independently deployable, or decide who owns production. Those remain architecture and ownership decisions.

Should standardization mean every team works the same way?

No. Common does not mean identical. Good standardization makes the important parts consistent, things like pipeline patterns, quality gates, security checks, and rollback discipline, while making the differences explicit. Exceptions are not always bad, but hidden exceptions are. The disease in both monorepos and many-repo systems is the same: burying the real process, whether in dependency chains or in conditional pipeline logic.

    © 2026 ABWaters. Made quietly.