First CI, Then CD

A lot of teams say they want CI/CD, but what they usually mean is that they want the deploy button to hurt less. That is understandable, because painful releases wear people down. Nobody enjoys release weekends, deployment checklists, manual coordination calls, or sitting around waiting to find out whether the thing they just shipped is about to break production.

The mistake is thinking that CD is the first problem to solve. It usually is not. The first problem is CI.

Before you can continuously deliver software, you have to continuously integrate it. That sounds obvious, but a lot of teams skip right past it. They try to automate deployments while still carrying long-running branches, unclear release boundaries, manual test gates, and code that gets merged before anyone can honestly say it is ready for production. That is not CI/CD. That is a faster way to move uncertainty around.

If you are not already running a full pipeline, I would start with two things: trunk-based development and automation. Both require technology, but neither one is mainly a tooling problem. They are mindset changes.

Trunk-based development changes the meaning of "done"

Trunk-based development sounds simple. Keep branches short-lived, merge frequently, and keep the main branch healthy. In practice it changes how a team thinks about unfinished work.

In a trunk-based workflow, code is developed in a production-ready manner. That does not mean every feature is visible to users the moment the code merges. It means the code on the main branch must always be in a state where it can be released, and that part is not negotiable.

There are really only two acceptable states for a change. Either the feature is done and ready to release, or the feature is not done but it is safely wrapped behind a feature flag so users cannot reach it before it is ready. There is no third state where the code gets merged because it is "mostly done" and everyone promises to clean it up before release. That is how teams turn the main branch into a staging area for unfinished thought, and once that happens, CI stops meaning anything useful. The main branch should not be a junk drawer.

This is hard for teams used to long-running branches. Long-running branches feel safe because they keep unfinished work away from the main branch, but they create a different kind of risk. They delay integration until the worst possible time, when everyone is under pressure and the differences between branches have had time to grow teeth. You do not avoid integration risk by avoiding integration. You just save it up. Trunk-based development forces that risk into the open earlier, and that is uncomfortable at first, but it is also the point. The system gets healthier when integration is a normal daily activity instead of a dramatic event near the end of a release.

Feature flags are not optional decoration

If a team is serious about trunk-based development, feature flags become part of the basic toolkit. They are not a fancy release strategy. They are how you separate code deployment from feature release, and that distinction matters.

Without feature flags, teams tend to delay merging until the entire feature is ready, which leads to larger branches, bigger pull requests, slower reviews, and more painful integration. With feature flags, a team can merge small slices of production-ready code without exposing unfinished behavior to users.

The flag is not a license to merge broken code. It is a way to keep incomplete product behavior hidden while the code itself stays safe to deploy. A feature hidden behind a flag should still compile, still pass tests, and still respect the shape of the system. It should not create operational risk just because the user cannot see it yet. The goal is not to hide bad work. The goal is to make good work easier to integrate in smaller pieces.
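
A minimal sketch of that separation, assuming a hypothetical in-process flag store. The `FLAGS` dictionary, the `new_checkout` flag name, and both checkout functions are illustrative names, not the API of any particular flag library. Both code paths are deployed and tested; the flag only decides which one users reach.

```python
# Hypothetical in-process flag store. Real systems usually load flags from
# configuration or a flag service so they can change without a deploy.
FLAGS = {"new_checkout": False}


def is_enabled(flag_name: str) -> bool:
    """Return the flag's state, treating unknown flags as off."""
    return FLAGS.get(flag_name, False)


def legacy_checkout(cart: list[float]) -> float:
    """Existing behavior that users see today."""
    return round(sum(cart), 2)


def new_checkout(cart: list[float]) -> float:
    """Merged and tested, but hidden until the flag is turned on."""
    subtotal = sum(cart)
    discount = 0.10 * subtotal if subtotal >= 100 else 0.0
    return round(subtotal - discount, 2)


def checkout(cart: list[float]) -> float:
    # Both paths ship in the same deploy; the flag picks the visible one.
    if is_enabled("new_checkout"):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a typo in a flag name hides the new behavior instead of exposing it.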

Automation is a signal, not a trophy

The second major step is automation. I have released software with manual testing, with automated testing, with a mixture of both, and in some cases with almost no meaningful testing at all. People sometimes call that last one "customer testing," usually with a nervous laugh, because everyone knows what it really means. The customer finds the problem first.

Most teams know they need more automation. Where they get stuck is believing they need perfect automation before they can make progress, and that is usually wrong. The idea that you need 100% coverage before automation becomes useful sounds responsible, but in real systems it tends to be paralyzing. It turns automation into a massive cleanup project that nobody has time to finish. The work becomes so large and abstract that teams either avoid it or build a brittle suite nobody trusts.

You do not need perfect coverage to improve the system. You need useful coverage in the places where it reduces risk. The better starting rule is simple: all new work should include automation. Do not make the backlog worse. Do not keep adding untested behavior while waiting for some future project to clean everything up. Every new feature, bug fix, and meaningful change should leave the system slightly better tested than it was before. That is how you climb out of the hole.

Unit tests should do most of the work

Most of the automated testing should be unit testing. That does not mean unit tests solve every problem, because they do not, but they should make up the body of the suite because they are fast, focused, and cheap to run.

A good unit test checks local behavior. It should not need a database, an external service, a network call, or a fully running application stack, and its dependencies should usually be mocked. The test asks a narrow question: given this input, and assuming this dependency returns this result, does this unit behave correctly? You are not trying to prove the whole system works in every unit test. You are trying to prove that each piece behaves correctly under specific conditions.
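
As a rough illustration of that narrow question, here is a sketch using Python's standard `unittest` and `unittest.mock`. The `shipping_cost` function, the `rate_service` dependency, and the free-shipping threshold of 50 are all invented for the example.

```python
import unittest
from unittest.mock import Mock


def shipping_cost(order_total: float, rate_service) -> float:
    """Unit under test: free shipping at 50 or more, else the carrier rate."""
    if order_total >= 50:
        return 0.0
    return rate_service.current_rate()


class ShippingCostTest(unittest.TestCase):
    def test_small_order_pays_carrier_rate(self):
        # Assume the dependency returns 4.99; no network call is made.
        rate_service = Mock()
        rate_service.current_rate.return_value = 4.99
        self.assertEqual(shipping_cost(20.0, rate_service), 4.99)
        rate_service.current_rate.assert_called_once()

    def test_large_order_ships_free_without_calling_carrier(self):
        rate_service = Mock()
        self.assertEqual(shipping_cost(75.0, rate_service), 0.0)
        rate_service.current_rate.assert_not_called()
```

Each test fixes the input and the dependency's answer, then checks one behavior. Neither one needs a database, a network, or a running application.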

When unit tests become slow, fragile, or too dependent on the outside world, teams stop trusting them, and once that happens they become ceremony. They still run in the pipeline, but nobody believes the signal. Fast tests get run, useful tests get fixed, and fragile tests get ignored. That is not a tooling issue; it is a feedback issue.

Use integration and contract tests where they pay for themselves

Some behavior cannot be proven with unit tests alone, and that is where integration tests and contract tests earn their place. Integration tests are useful when the risk sits between components. Maybe the application code and the database schema need to agree, maybe a service call needs to serialize data in the expected shape, or maybe two internal modules work fine alone but fail when wired together.
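
For the database case, a sketch along these lines shows the idea. It assumes an in-memory SQLite database from Python's standard library stands in for the real one, and the `users` table and helper functions are hypothetical.

```python
import sqlite3

# Hypothetical schema; in a real project this would come from migrations.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
"""


def add_user(conn: sqlite3.Connection, email: str) -> int:
    """Insert a user and return the generated id."""
    cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    conn.commit()
    return cur.lastrowid


def find_user_id(conn: sqlite3.Connection, email: str):
    """Return the user's id, or None when the email is unknown."""
    row = conn.execute(
        "SELECT id FROM users WHERE email = ?", (email,)
    ).fetchone()
    return row[0] if row else None


def test_code_and_schema_agree():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    user_id = add_user(conn, "a@example.com")
    assert find_user_id(conn, "a@example.com") == user_id
    assert find_user_id(conn, "nobody@example.com") is None
    # The UNIQUE rule lives in the schema, so a mock would never catch this.
    try:
        add_user(conn, "a@example.com")
    except sqlite3.IntegrityError:
        pass
    else:
        raise AssertionError("duplicate email should be rejected")
    conn.close()
```

The test exercises the SQL and the schema together, which is exactly the seam a mocked unit test cannot see.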

Contract tests are useful when systems depend on each other across boundaries. They answer a practical question: can this consumer and this provider still work together without requiring everyone to deploy everything at the same time? That is the kind of automation that actually supports CI.
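
One way to picture that question is a hand-rolled contract check like the sketch below. It is a simplification, not how any specific contract-testing tool works, and `ORDER_CONTRACT`, `provider_order_response`, and the field names are assumptions made up for illustration.

```python
# Hypothetical contract: only the fields and types the consumer reads.
ORDER_CONTRACT = {"order_id": str, "total_cents": int, "status": str}


def provider_order_response(order_id: str) -> dict:
    """Stand-in for the provider's real serializer."""
    return {
        "order_id": order_id,
        "total_cents": 1999,
        "status": "paid",
        "internal_note": "extra fields do not break the consumer",
    }


def contract_violations(payload: dict, contract: dict) -> list[str]:
    """List every way the payload breaks the contract; empty means compatible."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems
```

Run in the provider's pipeline, a check like this fails the build when a change would break the consumer, so the two sides can keep deploying independently.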

The goal is not to add every possible kind of test because a testing pyramid diagram told you to. The goal is to put the right tests at the right risk points. If a test does not produce a signal the team will act on, it is probably not worth much.

End-to-end tests should protect the main roads

End-to-end tests are where teams often get themselves in trouble. They are valuable, but they are expensive. They are slower, they are more fragile, and they tend to fail for reasons that have nothing to do with the code change being tested. If you try to cover every setting, every edge case, and every product variation through end-to-end automation, you can build a test suite that becomes its own production system, and that is usually a bad trade.

For end-to-end tests I prefer simple coverage of the primary user flows. Cover the 80% case, cover the paths most users depend on, and cover the flows where a failure would create obvious customer pain. Do not try to prove every detail through the browser. That does not mean details do not matter; it means the end-to-end layer is the wrong place to check all of them. Push detailed logic down into unit tests, use integration tests where components meet, and use end-to-end tests to make sure the main roads are open. The point is to catch the failures that would hurt the most users the fastest, and a small end-to-end suite that people trust beats a huge one everyone works around.

Manual testing does not disappear immediately

None of this means manual testing goes away on day one. A lot of teams are supporting systems that were never built with strong automated coverage, and pretending that manual testing can vanish immediately is not leadership; it is wishful thinking. The transition has to be honest.

Manual testing may still be needed for a while, especially around legacy areas, high-risk releases, complicated user workflows, or places where automation has not caught up yet. But it should stop being the permanent safety net for everything. The team should be steadily moving risk out of people's heads and into repeatable checks, and that is the important shift.

Manual testing should become more targeted over time. It should focus on judgment, exploration, and areas of genuine uncertainty. It should not be the only thing standing between a code change and a customer-impacting defect that could have been caught the same way every time. If the same manual test is being repeated every release, that is usually a candidate for automation.

CI is the discipline that makes CD safe

Once trunk-based development and meaningful automation are in place, CD becomes a much more reasonable goal. At that point the deploy pipeline is not being asked to perform magic. It is taking a known-good state of the main branch and moving it through a controlled path to production. That is very different from trying to automate deployment while the team is still guessing whether the code is safe.

CI gives you the signal, and CD acts on the signal. If the signal is weak, CD just helps you ship uncertainty faster. This is why "First CI, then CD" matters. Continuous delivery depends on continuous integration being real. Not aspirational, not partial, not a label on a pipeline. Real. The main branch must be releasable. The tests must be fast enough and useful enough to trust. The team must know how unfinished work stays hidden from users without poisoning the release path. The pipeline must represent the team's actual agreement about quality, not just a collection of steps somebody copied from another project.
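
The relationship between signal and action can be sketched as a small gate. This is an illustration under stated assumptions, not a real pipeline configuration: `CheckResult`, `REQUIRED_CHECKS`, and the check names are hypothetical, and in practice the list would encode whatever the team has actually agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool


# Hypothetical agreement about what "releasable" means for this team.
REQUIRED_CHECKS = ("unit", "integration", "contract", "e2e-primary-flows")


def can_deploy(results: list[CheckResult]) -> tuple[bool, list[str]]:
    """Allow deployment only when every required check ran and passed."""
    passed = {r.name for r in results if r.passed}
    # A check that never ran blocks the release just like a failed one.
    blockers = [name for name in REQUIRED_CHECKS if name not in passed]
    return (not blockers, blockers)
```

Treating a missing check as a blocker is the design choice that matters here: silence from CI is not the same thing as a passing signal.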

Start smaller than you want to

The practical path is not complicated, but it does require discipline. Move toward trunk-based development. Keep branches short. Require production-ready code at merge time. Use feature flags for incomplete features. Add automation to all new work. Build unit tests first, use integration and contract tests where the risk justifies them, and keep end-to-end tests focused on the primary user flows.

Do not wait for the perfect test suite, do not wait for the perfect repository structure, and do not wait for a massive transformation program to bless common sense. Start by making the main branch trustworthy. That is the first real milestone.

A full CI/CD pipeline is not just a deployment toolchain. It is a working agreement about how software moves from idea to production without creating avoidable risk for customers. You do not get there by automating chaos. You get there by making integration boring first, and then delivery has something solid to stand on.

Frequently asked questions

Why should CI come before CD?

Because continuous delivery depends on continuous integration being real. CD takes a known-good state of the main branch and moves it through a controlled path to production. If the team is still guessing whether the code is safe, automating deployment does not remove that uncertainty; it just ships it faster. CI produces the signal, CD acts on the signal, and a weak signal makes a fast pipeline dangerous rather than helpful.

Do I need 100% test coverage before automation is worth it?

No, and waiting for it usually backfires. Demanding perfect coverage first turns automation into a massive cleanup project nobody finishes, and teams either avoid it or build a brittle suite they do not trust. You need useful coverage where it reduces risk, not total coverage. The practical rule is that all new work should include automation, so every change leaves the system slightly better tested than before.

Are feature flags required for trunk-based development?

Effectively, yes. Feature flags are how you separate code deployment from feature release, which is what lets you merge small slices of production-ready code without exposing unfinished behavior to users. Without them, teams delay merging until a whole feature is done, which means bigger branches, slower reviews, and more painful integration. A flag is not permission to merge broken code: flagged code should still compile, pass tests, and avoid operational risk.

What kinds of tests should make up most of the suite?

Most of it should be unit tests, because they are fast, focused, and cheap to run. They check local behavior with dependencies mocked, asking whether a unit behaves correctly under specific conditions rather than trying to prove the whole system works. Use integration and contract tests at the risk points between components and across service boundaries, and keep end-to-end tests for the primary user flows. Put the right tests at the right risk points instead of chasing a pyramid diagram.

What belongs in end-to-end tests?

The main roads. Cover the primary user flows, the 80% case, and the paths where a failure would create obvious customer pain. End-to-end tests are slow and fragile and often fail for reasons unrelated to the change, so trying to prove every edge case through the browser builds a suite that becomes its own production system. Push detailed logic down into unit tests and use integration tests where components meet. A small suite people trust beats a huge one everyone works around.

When can manual testing go away?

Not on day one, and pretending otherwise is wishful thinking rather than leadership. Many teams support systems that were never built with strong automated coverage, so manual testing stays useful for a while around legacy areas, high-risk releases, and complicated workflows. The goal is to move risk out of people's heads and into repeatable checks over time, so manual testing becomes more targeted and focuses on judgment and exploration. If the same manual test runs every release, automate it.

    © 2026 ABWaters. Made quietly.