A deep dive into the CI/CD pipeline changes, caching strategies, and tooling decisions that cut our average build time by more than half.
Our CI pipeline used to take 22 minutes on average. Today it takes 8. This is the story of how we got there — and the dead ends we hit along the way.
Where we started
Like most teams, we'd accumulated years of CI debt. Tests that weren't parallelised. Docker layers that rebuilt from scratch every time. An integration test suite that ran every build even when only CSS had changed.
The first thing we did was measure. We instrumented every step and surfaced a flame graph of our build. The two biggest offenders:
- npm install: 6 minutes (!!)
- Integration tests: 9 minutes, running serially
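To illustrate the kind of per-step instrumentation described above, here is a minimal sketch. It is a simpler stand-in for a flame graph, not the tooling we actually used: the names `timed`, `durations`, and `slowest` are hypothetical, and it only records wall-clock time per named step.

```python
import time
from contextlib import contextmanager

# Hypothetical store: accumulated wall-clock seconds per named pipeline step.
durations: dict[str, float] = {}


@contextmanager
def timed(step: str):
    """Record how long the wrapped block takes under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        durations[step] = durations.get(step, 0.0) + elapsed


def slowest(n: int = 2) -> list[tuple[str, float]]:
    """Return the n slowest steps, longest first."""
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Wrapping each pipeline step in `with timed("npm install"):` and printing `slowest()` at the end of the build would be enough to surface the biggest offenders.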
The fixes
Caching node_modules. This sounds obvious, but our cache keys were wrong: we were keying on a hash of the entire package-lock.json, which changed whenever any package was updated, so most builds missed the cache. We switched to hashing only the engines field and the top-level dependencies. The cache hit rate went from 20% to 85%.
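A narrower cache key along these lines could be computed as in the sketch below. It rests on assumptions the text does not confirm: that the fields are read from package.json, that "top-level dependencies" covers both `dependencies` and `devDependencies`, and that a truncated SHA-256 is an acceptable key. The function name `cache_key` and the `node-modules` prefix are hypothetical.

```python
import hashlib
import json


def cache_key(package_json: dict, prefix: str = "node-modules") -> str:
    """Build a cache key from engines and top-level dependencies only."""
    # Assumption: these three fields are the only inputs that should bust the cache.
    relevant = {
        "engines": package_json.get("engines", {}),
        "dependencies": package_json.get("dependencies", {}),
        "devDependencies": package_json.get("devDependencies", {}),
    }
    # sort_keys makes the hash independent of key order in the file.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:16]}"
```

With a key like this, a version bump or a new script in package.json leaves the key unchanged, while a change to a declared dependency range produces a new one. The trade-off is that updates to transitive packages no longer invalidate the cache, so the install step still has to reconcile node_modules against the lockfile after a restore.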
Parallelising tests. We split our integration suite into four buckets by domain (auth, billing, core, notifications) and ran them in parallel. 9 minutes became 3.
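The bucket split can be sketched as follows. This assumes the four buckets run as concurrent processes on one machine; running them as separate CI jobs would work equally well and may be closer to what a given CI provider encourages. The commands are placeholders, and `run_buckets` and `BUCKETS` are hypothetical names.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_buckets(buckets: dict[str, list[str]]) -> dict[str, int]:
    """Run one command per bucket concurrently; return exit codes by bucket."""

    def run(cmd: list[str]) -> int:
        # check=False: collect the exit code instead of raising on failure.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max(1, len(buckets))) as pool:
        futures = {name: pool.submit(run, cmd) for name, cmd in buckets.items()}
        return {name: future.result() for name, future in futures.items()}


# Placeholder commands; a real pipeline would invoke its own test runner here.
BUCKETS = {
    name: [sys.executable, "-c", f"print('running {name} tests')"]
    for name in ("auth", "billing", "core", "notifications")
}
```

Wall-clock time is then set by the slowest bucket rather than the sum of all four, which is why evenly sized buckets matter. The build should fail if any value returned by `run_buckets(BUCKETS)` is non-zero.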
Selective test runs. We built a simple script that maps changed files to affected test files. A CSS-only change no longer triggers the full integration suite. This eliminated ~30% of full runs entirely.
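A mapping script of this kind might look like the sketch below. The path patterns, test file names, and the fall-back-to-full-suite behaviour for unmapped files are all assumptions made for illustration; the original script is not shown here.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from changed-file patterns to integration test files.
RULES: dict[str, list[str]] = {
    "src/auth/*": ["tests/integration/auth.test.ts"],
    "src/billing/*": ["tests/integration/billing.test.ts"],
    "*.css": [],  # style-only changes need no integration tests
}

# Assumption: running this directory means running the whole integration suite.
FULL_SUITE = ["tests/integration"]


def affected_tests(changed: list[str], rules: dict[str, list[str]]) -> list[str]:
    """Return test paths to run; fall back to the full suite for unmapped files."""
    selected: set[str] = set()
    for path in changed:
        matches = [tests for pattern, tests in rules.items() if fnmatchcase(path, pattern)]
        if not matches:
            # Unknown file: be conservative and run everything.
            return list(FULL_SUITE)
        for tests in matches:
            selected.update(tests)
    return sorted(selected)
```

Feeding it the output of something like `git diff --name-only` against the target branch gives the list of tests to run. An empty result means the integration suite can be skipped for that build.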
What didn't work
We tried a remote build cache (Turborepo + S3). The setup took two days, and the cache hit rate in CI was only 40%, which was not worth the added complexity. We rolled it back.
The result
22 minutes → 8 minutes. Engineers merge more often, deploys are less scary, and Friday afternoon deployments are no longer a team-wide anxiety event.