You Don't Actually Need Multi-Agents

May 11, 2026

[Image: A herd of cattle grazing across open rangeland under a wide sky]

Every few weeks a new "multi-agent" product shows up on my timeline. Named bots. Custom personas. A group chat UI where you can @ different agents and watch them talk to each other. The demos always look charming for about ten minutes.

Then nobody opens them again.

I've been trying to figure out why this keeps happening, and I think the answer is older than the AI boom. It's the same mistake the cloud industry made in 2010, and the same correction is about to happen again — much faster this time, because the physics are even more obvious.

To be clear, I'm talking about agents in the productivity lane. Companion AI is its own thing and most of what follows doesn't apply to it.

A detour through 2012

Around 2012, ops culture went through a quiet revolution. The shorthand was pets vs. cattle, popularized by Randy Bias after Bill Baker first drew the distinction.

Pets had names. db-prod-01. mail-server. jenkins-master. Each one was hand-configured, slightly different from its neighbors, and irreplaceable. When a pet got sick, someone SSH'd in at 2 a.m. and nursed it back to health. You'd run a single machine for five years and pray.

Cattle had numbers. web-7a3f, worker-9821. All of them booted from the same image. When one misbehaved, you didn't fix it — you killed it and let the scheduler start a new one. You scaled by adding more, not by upgrading any single instance.

The shift wasn't really about ops technique. It was about where trust lived. In the pets era, trust lived inside individual machines: "this server has been good to us." In the cattle era, trust moved to the system that produces machines. The instance became disposable because the process was reliable.

This is the move that made Kubernetes, Terraform, Lambda, and every modern cloud abstraction possible.

The agent industry is in its pets era

Look at how most agent products are designed today.

  • Agents have names and faces.
  • They have persistent personas you can tune.
  • They have "memory" — and that memory is treated as part of the agent's identity, not the user's.
  • When you want two of them to collaborate, you put them in a chat thread.
  • When one of them does something useful, you save it. You build a relationship with it.

This feels natural because we're modeling agents on humans, and humans are pets by physics. A person can't be copied. Context-switching them is expensive. Trust takes years to build. So we name them, give them roles, and accept that "only Sarah can really handle this account."

[Diagram: PETS — named, persistent, irreplaceable. "agent-sarah-v3": persona, memory, history; hand-tuned, long-lived; "only she can handle this." CATTLE — numbered, disposable, interchangeable: t-7f3, t-9b1, t-a2e, t-4c8, and so on; lifecycle: spawn → run → write → destroy.]

But agents have none of those constraints. They can be spawned in a second. They can be cloned a thousand times in parallel. They can be destroyed with no severance package. The single biggest physical fact about an LLM-based agent is that it's free to create and free to throw away.

If your product design doesn't take advantage of this, you've imported a constraint that doesn't exist. You're raising cattle as if they were pets, and then wondering why the barn won't scale.

Cattle-mode agents

Here's what the alternative looks like, concretely. The mental model I keep coming back to:

One task → one agent → done → destroyed.

State doesn't live inside the agent. State lives in a shared workspace — the project, the repo, the user's files, the database, whatever the persistent substrate of the work is. The agent is a pure function over a snapshot of that workspace. It reads what it needs, does its job, writes a delta back, and dies.

So instead of "ask Sarah-the-research-bot to look something up," the system spawns a ResearchAgent bound to a specific query and a set of tools. It runs. It writes findings to the workspace. It's gone. The next task spawns a fresh one, possibly in parallel with five others doing different things.

You don't talk to the agent. You talk to the workspace. The agent is just whatever the scheduler happened to instantiate to satisfy your last request.
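The lifecycle above can be sketched in a few lines. This is a minimal illustration, not a real framework: Task, run_agent, and the dict-as-workspace are all hypothetical stand-ins (a real agent would call a model and tools where the dispatch comment is).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    kind: str   # e.g. "research", "codemod"
    spec: str   # what to do

def run_agent(task: Task, snapshot: dict) -> dict:
    """An agent as a pure function: reads a workspace snapshot,
    returns a delta. No state survives the call."""
    # Hypothetical dispatch; a real agent would invoke a model + tools here.
    if task.kind == "research":
        return {f"findings/{task.spec}": f"notes on {task.spec}"}
    return {}

def execute(task: Task, workspace: dict) -> dict:
    delta = run_agent(task, dict(workspace))  # agent sees only a copy
    return {**workspace, **delta}             # delta lands in the substrate;
                                              # the agent itself is gone

ws = execute(Task("research", "pets-vs-cattle"), {})
```

Note that nothing here has a name or a lifetime beyond the call: the agent exists only for the duration of `execute`, and everything durable lives in `ws`.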

This sounds like a small reframing, but in practice it changes almost every design decision downstream.

Why this is strictly stronger

Once you commit to disposable, workspace-mediated agents, a lot of hard problems get easier:

  • Replayable. Same workspace snapshot plus same task spec produces the same trajectory. You can debug agents the way you debug pure functions, not the way you debug a flaky coworker.
  • Parallel by default. There's no worker affinity. Fan out N copies. Run a search agent and a code-mod agent and an eval agent against the same workspace at the same time, and merge their deltas at the end.
  • Swappable. Today's agent is Claude. Tomorrow's is Gemini. Next year's is something you wrote yourself out of smaller models and tool chains. The workspace schema doesn't notice.
  • Honest about failure. A broken agent gets killed. You don't have to argue with it. You don't have to retrain it. You don't have to maintain its self-esteem.
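"Parallel by default" falls out almost for free once agents are functions over a snapshot. A toy sketch, assuming the same dict-as-workspace stand-in as before (the `agent` function here is a placeholder for any model call):

```python
from concurrent.futures import ThreadPoolExecutor

def agent(task_id: int, snapshot: dict) -> dict:
    # Each worker is fresh and anonymous; no affinity, no shared state.
    return {f"result/{task_id}": f"seen {len(snapshot)} keys"}

snapshot = {"repo": "...", "tasks": "..."}

# Fan out N disposable agents against the same snapshot.
with ThreadPoolExecutor() as pool:
    deltas = list(pool.map(lambda i: agent(i, snapshot), range(5)))

# Merge their deltas at the end; the order of workers never mattered.
merged: dict = {}
for d in deltas:
    merged.update(d)
```

Killing a "broken" worker here is just dropping its delta on the floor; there is nothing else to clean up.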

Compare this to the group-chat model, where a "buggy" agent has accumulated state, history, and reputation, and ripping it out feels like firing a teammate. That's a sign your architecture is doing emotional labor it shouldn't be doing.

The moat isn't the agent

Here's the part I think is most often missed.

In the cattle era of cloud, the valuable companies were not the ones that made the best individual server. They were the ones that built the systems around servers — Kubernetes, Terraform, the whole orchestration and infrastructure-as-code layer. Once you stopped naming machines, the machine itself became a commodity, and the value migrated to the substrate.

[Diagram: a scheduler spawning short-lived agents — ResearchAgent (spawned 2s ago), CodeModAgent (running, reads/writes), EvalAgent (writing delta), FormatAgent (destroyed, gc'd) — all over a single workspace: the persistent, shared source of truth (files, repo, tasks, artifacts, memory, history, config, schema, refs, logs, graph, cache, deltas, env). Schema, scheduling, gc: the real moat.]

The same migration is starting to happen for agents. The interesting design surface — the part that's actually hard, defensible, and not commoditized — is not the prompt or the persona. It's:

  • Workspace schema. What state is persisted? At what granularity? How is it structured so multiple agents can read and write without corrupting each other?
  • Task scheduling. When does a new task spawn an agent? Which agent? With what tools? When does it get killed?
  • Concurrency and conflict resolution. Two agents both want to edit the same artifact. What happens?
  • Garbage collection. Workspaces fill up with intermediate junk. What's worth keeping? What gets pruned?
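To make the concurrency question concrete, here is one possible answer sketched in miniature: a versioned workspace where every delta carries the version it was read against, so conflicting writes are detected rather than silently interleaved. This is optimistic concurrency control in a dozen lines; the class and its API are invented for illustration, not taken from any real system.

```python
class Workspace:
    """Versioned store: commits carry the version they read,
    so a stale agent's delta is rejected instead of clobbering."""

    def __init__(self):
        self.data: dict = {}
        self.version = 0

    def snapshot(self) -> tuple[dict, int]:
        return dict(self.data), self.version

    def commit(self, delta: dict, read_version: int) -> bool:
        if read_version != self.version:
            return False          # someone wrote first; caller retries or merges
        self.data.update(delta)
        self.version += 1
        return True

# Two agents read the same snapshot and both try to write.
ws = Workspace()
_, v = ws.snapshot()
first_ok = ws.commit({"report": "draft A"}, v)    # wins
second_ok = ws.commit({"report": "draft B"}, v)   # stale read: rejected
```

Whether you retry, merge, or escalate on rejection is exactly the kind of unglamorous design decision the section above is pointing at.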

These are unglamorous questions. They don't make for good demos. But they're the equivalent of the questions K8s answered for servers, and whoever answers them well for agents gets the position that AWS and HashiCorp got in the previous cycle.

Most "AI moats" being pitched right now are still in the pets layer. Better personas. Smarter memory. Cuter avatars. Meanwhile the cattle ranchers haven't really shown up yet, and the ground floor is wide open.

You probably don't need multi-agents

So when someone asks "should we build multi-agent collaboration into our product?" — the honest answer is usually no. Not because the underlying capability is wrong, but because the question is framed in the pets vocabulary. It assumes agents are entities with identity, and that "collaboration" means putting two of them in a room.

What you almost always actually want is:

  1. A workspace that holds your real state.
  2. A way to describe tasks against that workspace.
  3. A scheduler that instantiates short-lived agents to satisfy those tasks.
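Wiring the three ingredients together can be this small. A hypothetical end-to-end loop, with the dict-as-workspace stand-in again; note that agents get a throwaway id, never a name, and never see each other:

```python
import uuid

def make_agent(task: str):
    """The scheduler instantiates a short-lived agent per task."""
    agent_id = uuid.uuid4().hex[:4]          # a number, not a name
    def run(snapshot: dict) -> dict:
        # Placeholder for real work against the snapshot.
        return {f"artifacts/{task}": f"done by t-{agent_id}"}
    return run

workspace: dict = {}
for task in ["search", "codemod", "eval"]:   # tasks described against the workspace
    agent = make_agent(task)                 # spawn
    workspace.update(agent(dict(workspace))) # run, write delta back
    del agent                                # destroy; no identity retained
```

Three agents ran; none of them met, and only the workspace remembers they existed.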

If you build those three things, "multi-agent" emerges for free. You'll run dozens of agents a day without naming any of them. They'll never meet each other. They won't need to.

The product surface your user sees isn't a chat with bots. It's their workspace, getting steadily more useful, while invisible cattle pass through it doing work.

Closing thought

It took the cloud industry the better part of a decade to internalize that servers are consumables. Agents should be a faster transition, because the physics are even more obviously on the disposable side — but only for the teams that notice early.

Don't name your agents. Name your workspace.