When Agencies Build Their Own Tools: Two Cases From Our Stack in 2026
There is a familiar pattern in agency operations: you adopt a commercial tool because it solves 80% of the problem, then spend the next two years working around the remaining 20%. Eventually the workarounds accumulate, the friction compounds, and someone on the team says the quiet part out loud. We could just build this.
Building is usually the wrong call. Commercial tools exist because someone else has already solved most of the hard problems and continues to maintain the solution. The math almost always favors paying the subscription. Almost always.
But there are two categories where it stops making sense to buy: infrastructure that shapes how you deliver client work, and workflow tooling that every person on the team touches all day. In those categories, friction compounds fast enough that building your own is not a vanity project. It is survival.
This post walks through two internal tools we shipped in 2026, why we built each one, and the decision criteria we used to justify the engineering time.
Case One: Multi-Agent Orchestration for Client Delivery
The first tool is infrastructure. Over the last eighteen months, an increasing share of our client delivery work has involved autonomous agents: customer support automation, catalog enrichment pipelines, commerce analytics assistants, internal knowledge agents. Each engagement had similar requirements under the surface (deterministic routing, crash recovery, observability, typed graph definitions), but we kept building those primitives fresh on top of a different open-source framework for each project.
The open-source agent frameworks we evaluated optimized for research velocity, which is the correct tradeoff for their audience: rapid iteration on agent graphs, flexible prompt composition, easy swapping of model providers. In a research setting, you want those affordances. In a production setting serving real user traffic, you want boring operational properties: replayable state, graceful failure handling, exhaustive telemetry, and deterministic behavior across model versions.
We built Contra Swarm because the gap between "works in a notebook" and "survives a Tuesday afternoon at scale" is where most agent frameworks live, and we were tired of rebuilding the operational layer for every client. The runtime handles deterministic agent routing, state persistence backed by PostgreSQL, first-class OpenTelemetry instrumentation, and replayable execution for debugging failed runs.
The internal justification was specific. We did the math on a single client engagement: three engineers, two weeks of work rebuilding orchestration primitives we had already built twice before. Across the pipeline of engagements we had booked, that recurring cost exceeded the engineering budget required to build a reusable runtime within three quarters. The build decision was not driven by "we can do better than LangGraph." It was driven by "we are paying this cost three times a year regardless, we may as well pay it once."
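The recurring side of that math can be sketched in a few lines. This is an illustration, not our actual planning spreadsheet: the function name is hypothetical, and it assumes the three engagements per year are spread evenly across quarters. The build budget is not stated in this post, so the sketch only computes the cost it was compared against.

```python
def recurring_rebuild_cost(engineers: int, weeks_per_engagement: float,
                           engagements_per_year: float, quarters: float) -> float:
    """Engineer-weeks spent rebuilding orchestration primitives over a horizon.

    Assumes engagements are spread evenly across the year (an assumption,
    not something the post states).
    """
    cost_per_engagement = engineers * weeks_per_engagement
    engagements_in_horizon = engagements_per_year * (quarters / 4)
    return cost_per_engagement * engagements_in_horizon


# Figures from the post: three engineers, two weeks, three times a year,
# measured over three quarters.
cost = recurring_rebuild_cost(engineers=3, weeks_per_engagement=2,
                              engagements_per_year=3, quarters=3)
print(f"{cost} engineer-weeks over three quarters")
```

Under those assumptions, a reusable runtime only has to cost less than that figure to pay for itself inside the same window.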
Open sourcing followed the same logic. If other agencies or product teams face the same production requirements, the runtime is more valuable maintained in public than locked in a private repo where it becomes someone's weekend chore.
Case Two: Account Switching for Multi-Tenant Workflows
The second tool is workflow. Agency work means every team member operates inside ten to twenty cloud accounts simultaneously: client Gmail addresses, client Slack workspaces, client Linear workspaces, client GitHub orgs, plus our own. The browser tab situation gets absurd fast, and commercial multi-account managers all have the same limitation: they isolate sessions but do not isolate workflows.
We tried everything. Browser profiles (workable, but switching breaks flow). Separate browsers per client (eats RAM, splits extensions). Commercial multi-account apps (most isolate poorly and force cross-contamination of notifications). The friction was small per interaction and enormous over a workday.
Switchboard is the desktop app we built for this. It isolates browser sessions per account across 25+ services (Gmail, Outlook, Slack, Linear, Discord, and custom service URLs), persists sessions across restarts, aggregates notifications, and lets us switch accounts with keyboard shortcuts instead of tab hunting. The underlying architecture is Electron with Chromium session isolation and SQLite for persistence.
The decision criteria here were different from Contra Swarm. This is not delivery infrastructure. Clients do not see it. There is no revenue story. The justification was pure productivity math: if the average engineer loses fifteen minutes a day to account switching friction, across a team of twelve people over a year that is roughly 750 hours. Building Switchboard cost less than that in engineering time, and the recovered hours compound every year the tool exists.
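The arithmetic behind the 750-hour figure is simple enough to check. In the sketch below, the 250 working days per year is an assumption (the post does not state it); it is the value that reproduces the stated total. The function name is hypothetical.

```python
def hours_lost_per_year(minutes_per_day: float, team_size: int,
                        working_days: int = 250) -> float:
    """Total team hours lost per year to a recurring daily friction.

    working_days defaults to 250, an assumed figure rather than one
    taken from the post.
    """
    return minutes_per_day * team_size * working_days / 60


lost = hours_lost_per_year(minutes_per_day=15, team_size=12)
print(f"{lost:.0f} hours per year")
```

Swapping in your own team size and per-day estimate gives the ceiling a build has to come in under for the first year alone.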
Unlike Contra Swarm, Switchboard was not designed to be the production backbone of anything. It exists to remove a specific daily annoyance. That is a valid reason to build.
How We Decide to Build Versus Buy
Both of these tools cleared the same decision process, which we have started writing down because it avoids the recurring debate on every new pain point.
First: is the friction getting worse, flat, or improving? Friction that is actively getting worse justifies engineering time. Friction that has plateaued usually gets absorbed without building anything new.
Second: is the problem in your core competency? We are an engineering agency. Building engineering tools is adjacent to what we do. An agency that builds video tools for its workflow is making a harder bet because the failure modes are less familiar.
Third: what is the maintenance tail? This is where most "we should build this" conversations should end. The build cost is visible. The maintenance cost is not. Internal tools accumulate bugs, dependency upgrades, and platform compatibility issues the moment you ship them. If no one is committed to owning the tail, the tool will decay and the original problem will return with interest.
Fourth: does the build create external leverage? Contra Swarm passed this test. We already needed the runtime internally, and open sourcing it created a positioning asset for the agency. Switchboard passed a weaker version of this test: it is not open sourced as a strategic move, but shipping it publicly signals something about how we think about operational tooling.
If a proposed internal tool fails three of these four, we do not build it. If it passes three, we usually do. The two tools described here passed all four.
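The four questions and the threshold rule can be written down as a small checklist. This is a sketch of the rule as stated here, not a tool we run: the class and field names are hypothetical. The rule above does not say what happens when exactly two checks pass, so the sketch returns a separate "unspecified" outcome for that case rather than guessing.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class BuildCheck:
    """One answer per question in the build-versus-buy process."""
    friction_getting_worse: bool      # first question
    in_core_competency: bool          # second question
    maintenance_owner_committed: bool # third question
    creates_external_leverage: bool   # fourth question

    def passed(self) -> int:
        # Count how many of the four checks came back True.
        return sum(astuple(self))


def recommend(check: BuildCheck) -> str:
    passed = check.passed()
    if passed >= 3:
        return "usually build"
    if passed <= 1:
        # Failing three (or all four) of the checks.
        return "do not build"
    # Exactly two passes: the stated rule does not cover this case.
    return "unspecified"


print(recommend(BuildCheck(True, True, True, True)))
print(recommend(BuildCheck(True, False, False, False)))
```

Writing the answers down as explicit booleans is most of the value: it forces the maintenance question to get a yes or no instead of a shrug.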
What This Means for Other Teams
The meta-lesson is not "build more of your own tools." Most internal tools are a tax on the engineering team that nobody accounted for at planning time. The lesson is: know which category of friction you are dealing with, and do not confuse "this is annoying" with "this is worth engineering time."
Infrastructure that shapes how you deliver client work is almost always worth building when commercial options force architectural compromises you cannot live with. Workflow tooling that every person touches all day is worth building when the friction compounds faster than the build cost. Everything else should stay on a corporate credit card.
How Contra Collective Approaches This for Clients
We advise client teams through the same decision process when they are weighing a build versus buy question. Usually the answer is buy, because the cost of maintaining internal tooling is chronically underestimated. Sometimes, though, the friction profile justifies the build, and when it does we want clients to go in with clear eyes about what they are committing to. If you are evaluating whether to build internal infrastructure or adopt a commercial platform, book a free technical audit and we will walk through the decision with you.
Final Thoughts
Every agency stack is a long series of tradeoffs between what you buy, what you build, and what you tolerate. The tools worth building are the ones where the friction is structural rather than incidental, where the maintenance cost has a committed owner, and where the leverage compounds over time. Two tools shipped in one year is not a trend. It is a reminder that the default answer should still be "pay the subscription," and the exceptions should earn their engineering time honestly.