superset-sh · saddlepaddle · Feb 3, 2026 · Feb 2, 2026 · Feb 3, 2026 · Feb 3, 2026
diff --git a/apps/marketing/content/blog/roadmap-to-100-agents.mdx b/apps/marketing/content/blog/roadmap-to-100-agents.mdx
@@ -0,0 +1,150 @@
+---
+title: "Our plan for running 100 parallel coding agents"
+description: "An attempt to crystallize our plans for 2026"
+author: satya
+date: 2026-02-02
+category: Product
+---
+
+![Superset — managing coding agents in parallel](/blog/roadmap-to-100-agents/cover.png)
+
+Right now at Superset, we can reliably manage 5-7 coding agents in parallel, whether that's Claude Code, Codex, or
+something else. Our goal is to manage 100 coding agents in parallel by the end of 2026.
+
+Most people believe that the path from seven to 100 agents is better models, faster inference, and smarter agents. It's
+not. Agent compute is already cheap enough: you can run hundreds of agents a month for less than the cost of one
+engineer.
+
+What's stopping us is that every agent needs a human to review its code, give feedback, and decide what to work on next.
+Scale the agents all you want - it's the humans who don't scale.
+
+# Mapping out the problem
+
+You can imagine the agent loop as a pipeline, and the goal is to improve throughput:
+
+![The agent pipeline — most steps require a human](/blog/roadmap-to-100-agents/pipeline-diagram.svg)
+
+There's a clear bottleneck that emerges when you look at this. A human is involved in almost every step, and each of
+these steps has a steep context-switching cost - you have to open that agent's code, spin up dev servers, click through
+the UI to verify its work, give feedback, and more. Right now, most of our agents spend more time waiting for us to
+review their work than they spend doing it.
+
+At 100 agents, this model completely breaks. You can't review 100 diffs a day. You can't context-switch between 100
+streams of work.
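A quick back-of-the-envelope calculation makes the breaking point concrete. The sketch below assumes one review per agent per day and about 15 minutes per review; both numbers are illustrative assumptions rather than measurements, and `review_hours_per_day` is a hypothetical helper.

```python
def review_hours_per_day(agents: int, reviews_per_agent: float, minutes_per_review: float) -> float:
    """Human review time needed per day, in hours."""
    return agents * reviews_per_agent * minutes_per_review / 60


# Assumed inputs: one review per agent per day, 15 minutes each.
hours_at_7 = review_hours_per_day(7, 1, 15)
hours_at_100 = review_hours_per_day(100, 1, 15)
print(hours_at_7)    # 1.75
print(hours_at_100)  # 25.0
```

Under these assumptions, seven agents cost under two hours of review a day, while 100 agents would need more review hours than a day contains.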
+
+The fix is straightforward: pull the human out of steps where they're not needed, and make the remaining steps faster.
+
+# How we'll improve it
+
+## Have agents work harder before reaching out to you
+
+If you've worked with a coding agent, you've had the experience: the agent comes back with something half-baked, you
+spend 15 minutes catching up to what it did, spin up a dev server, click around, then feed it the same feedback you've
+given a dozen agents before. Most of the time you spend reviewing isn't making decisions — it's catching problems that
+should have been caught before the work reached you.
+
+The fix is adding layers between the agent and you. The agent's work should be vetted thoroughly before it is ever
+presented to you.
+
+![Agent work passes through review layers before reaching you](/blog/roadmap-to-100-agents/quality-gates.svg)
+
+### Adversarial agents
+
+[Block published a paper recently](https://block.xyz/documents/adversarial-cooperation-in-code-synthesis.pdf) that
+highlights how useful having agents work together can be. The general idea is that they assign two agents to a task: one
+to implement it, and the other to push the implementer to write tests, review its work, and do due diligence before
+picking a solution.
+
+A similar pattern can be used to reduce interruptions for you: you could have a dedicated bouncer agent that sits
+between the coding agents and you, holding back each agent's work until it's sure that agent is either done with its
+work or sufficiently stuck. Your review becomes a final sign-off, not a first pass.
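One way to picture the bouncer is as a small gate function. This is a minimal sketch under stated assumptions: `AgentStatus`, `bouncer_decision`, and the `max_failed_attempts` threshold are hypothetical names chosen for illustration, and a real bouncer would likely be an agent making a judgment call, not a fixed rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentStatus:
    reports_done: bool      # the coding agent claims the task is finished
    checks_passing: bool    # tests/lint the bouncer ran against the work
    failed_attempts: int    # consecutive attempts that did not pass checks


def bouncer_decision(status: AgentStatus, max_failed_attempts: int = 3) -> Optional[str]:
    """Return why the work should reach the human, or None to keep the agent working."""
    if status.reports_done and status.checks_passing:
        return "done"
    if status.failed_attempts >= max_failed_attempts:
        return "stuck"
    return None


print(bouncer_decision(AgentStatus(True, True, 0)))    # done
print(bouncer_decision(AgentStatus(True, False, 1)))   # None
print(bouncer_decision(AgentStatus(False, False, 3)))  # stuck
```

The human only hears about the first and last cases; the middle one is sent back to the agent without an interruption.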
+
+### Stacking review agents and automated testing
+
+Since you don't care how long an agent takes when you're running dozens of them, there's no downside to stacking checks.
+Run five different review agents, each looking for different classes of issues, with a final agent consolidating the
+feedback. Each layer increases the odds that problems are found and resolved before you ever see the code.
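The stacking idea can be sketched as a list of reviewers feeding one consolidation step. Everything here is an assumption made for illustration: the three reviewers are trivial string checks standing in for real review agents, and `run_review_stack` and `consolidate` are hypothetical names.

```python
from typing import Callable, List

Reviewer = Callable[[str], List[str]]


def secrets_reviewer(diff: str) -> List[str]:
    return ["possible hardcoded secret"] if "API_KEY=" in diff else []


def debug_reviewer(diff: str) -> List[str]:
    return ["leftover debug print"] if "print(" in diff else []


def todo_reviewer(diff: str) -> List[str]:
    return ["unresolved TODO"] if "TODO" in diff else []


def consolidate(findings_per_reviewer: List[List[str]]) -> List[str]:
    """Merge all findings into one list, dropping duplicates but keeping order."""
    merged: List[str] = []
    for findings in findings_per_reviewer:
        for finding in findings:
            if finding not in merged:
                merged.append(finding)
    return merged


def run_review_stack(diff: str, reviewers: List[Reviewer]) -> List[str]:
    # Each reviewer looks for one class of issue; the consolidator merges them.
    return consolidate([review(diff) for review in reviewers])


sample_diff = 'API_KEY="abc123"\nprint(user)\n'
issues = run_review_stack(sample_diff, [secrets_reviewer, debug_reviewer, todo_reviewer])
print(issues)  # ['possible hardcoded secret', 'leftover debug print']
```

Adding another layer is just appending another reviewer to the list, which is why stacking checks is cheap when wall-clock time per agent is not the constraint.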
+
+The same logic applies to testing. Giving agents access to the browser through tools like
+[BrowserUse](https://browser-use.com/), or to UI test flows through [Maestro](https://maestro.mobile.dev/), lets them
+verify their own work visually, catching UI regressions, layout issues, and interaction bugs that are invisible in code
+review alone.
+
+### Long-running agents
+
+Most agent workflows today are one-shot: you give a task, the agent works, it comes back. But agents should be able to
+run longer loops — trying an approach, hitting an issue, adjusting, and iterating until they're confident or genuinely
+stuck. [Ralph loops](https://ghuntley.com/loop/) are a popular pattern for this: treat the agent's work as clay on a
+wheel, refining iteratively, rather than laying bricks in a line.
+
+The result is fewer interruptions and higher-quality output when the agent does surface. An agent that's been iterating
+for an hour and is confident in its solution is far easier to review than one that gave up after its first attempt.
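A generic iterate-until-confident loop might look like the sketch below. It is not an implementation of Ralph loops specifically; `iterate_until_confident`, the confidence score, and the `fake_attempt` stand-in for a real agent call are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# An attempt takes feedback from the previous round and returns
# (work, confidence in [0, 1], feedback for the next round).
Attempt = Callable[[Optional[str]], Tuple[str, float, str]]


def iterate_until_confident(
    attempt: Attempt, threshold: float = 0.9, max_rounds: int = 10
) -> Tuple[str, str, int]:
    """Loop until the agent is confident, or report it as stuck after max_rounds."""
    feedback: Optional[str] = None
    work = ""
    for round_number in range(1, max_rounds + 1):
        work, confidence, feedback = attempt(feedback)
        if confidence >= threshold:
            return ("confident", work, round_number)
    return ("stuck", work, max_rounds)


def fake_attempt(feedback: Optional[str]) -> Tuple[str, float, str]:
    # Stand-in for a real agent call: confidence grows with each round.
    rounds_so_far = 0 if feedback is None else int(feedback)
    confidence = 0.5 + 0.25 * rounds_so_far
    return (f"draft {rounds_so_far + 1}", confidence, str(rounds_so_far + 1))


print(iterate_until_confident(fake_attempt))  # ('confident', 'draft 3', 3)
```

Either outcome is a reasonable moment to involve a human: "confident" means a review, "stuck" means a request for help.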
+
+## Make it fast to review agents' work
+
+Most developer tools today are human-driven — you open a diff, you spin up a dev server, you navigate to the right page.
+Agents plug into these tools, but the human is still doing the legwork. We want to shift the paradigm towards
+agent-driven UIs - interfaces that agents orchestrate for the human's benefit, where each review takes seconds, not
+minutes.
+
+### Investing in agent-driven UIs
+
+When you review an agent's work today, you're dropped into a diff with no context. You have to reconstruct what the
+agent was trying to do, spin up an environment to test it, and navigate to the right pages to verify. That's the agent
+dumping its work on your desk.
+
+In an agent-driven UI, the agent prepares your review for you. It writes a summary of what changed and why, spins up a
+preview environment, navigates you to the specific pages or flows it wants you to look at, and surfaces the test results
+that matter. When you open a completed task, you should be looking at a prepared briefing, not raw output.
+
+### Make existing tools better
+
+PR reviews, CI dashboards, IDEs — these are all built for a world where humans drive the interactions. In an agent-first
+world, the tools need to meet you differently. Agents should be annotating their own PRs before you open them, the way
+[Devin's review](https://app.devin.ai/review) adds context to diffs ahead of time. CI results should be summarized and
+triaged by an agent, not presented as a raw log for you to parse. The tools we use every day were designed for human
+authors — adapting them for human reviewers of agent work is a different design problem.
+
+### Reducing friction to zero
+
+Every interaction between you and an agent should be as lightweight as possible. You should be able to click yes or no
+for straightforward changes. Agents should prep multiple-choice questions — "I found three approaches to this, which do
+you prefer?" — so you're choosing instead of typing. When an agent does need written feedback, supporting agents can
+prefill a draft response based on the context, so you're editing instead of writing from scratch. Quick actions like
+"create PR" or "deploy to staging" should also be easy to reach.
+
+The goal isn't just faster review — it's making the interaction so lightweight that you can do it from your phone
+between meetings.
+
+## Have agents be more proactive
+
+![Events trigger agents automatically](/blog/roadmap-to-100-agents/proactive-agents.svg)
+
+Everything above assumes you're the one deciding what agents work on. But at 100 agents, planning is itself a
+bottleneck. You can't spec out 100 tasks a day — that requires understanding the codebase, the product priorities, and
+the nuances of each task.
+
+### Reusable workflows
+
+The building blocks for this are already emerging. [OpenAI's Codex skills](https://developers.openai.com/codex/skills/)
+let you package repeatable workflows — deploy procedures, migration steps, test patterns — as reusable bundles that
+agents can invoke on their own when the situation matches. Instead of writing the same instructions every time, you
+encode them once and the agent recognizes when to apply them.
+
+### Event-driven triggers
+
+[Devin's workflows](https://devin.ai/) take this further with event-driven triggers. A build fails, and a Devin instance
+spins up to investigate. A Linear ticket is created, and an agent starts working on it automatically. Teams create
+playbooks for recurring tasks — setting up changelogs, running code migrations, adding test coverage — that agents
+execute on a schedule or in response to events without anyone initiating them.
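The event-to-playbook idea can be sketched as a small registry. This is not Devin's API or any real product's interface; `TriggerRegistry`, the event names, and the payload fields are hypothetical and exist only to show the shape of the pattern.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Playbook = Callable[[dict], str]


class TriggerRegistry:
    """Maps event types to the playbooks that should run when they occur."""

    def __init__(self) -> None:
        self._playbooks: Dict[str, List[Playbook]] = defaultdict(list)

    def on(self, event_type: str, playbook: Playbook) -> None:
        self._playbooks[event_type].append(playbook)

    def dispatch(self, event_type: str, payload: dict) -> List[str]:
        # Unknown event types simply start nothing.
        return [playbook(payload) for playbook in self._playbooks.get(event_type, [])]


registry = TriggerRegistry()
registry.on("build_failed", lambda e: f"investigate build {e['build_id']}")
registry.on("ticket_created", lambda e: f"start work on {e['ticket']}")

print(registry.dispatch("build_failed", {"build_id": 42}))  # ['investigate build 42']
print(registry.dispatch("deploy_finished", {}))             # []
```

In a real system each playbook would start an agent run instead of returning a string, but the point is the same: nobody has to initiate the work by hand.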
+
+### Beyond code
+
+Even outside of code, this pattern is taking hold. [Circleback](https://circleback.ai/) listens to your meetings and
+doesn't just take notes — it extracts action items, creates Linear tickets for feature requests mentioned in product
+demos, and updates your CRM after sales calls. The meeting ends and the downstream work is already in motion.
+
+We don't have all of this figured out yet. Some of it is live, some is on our roadmap, and some is still taking shape.
+But the throughput framing gives us a clear test for every feature we build: does this reduce the time a human spends
+per agent interaction? 
+
+If you're running agents at scale and hitting these walls, we'd love to compare notes. Reach out to us at [founders@superset.sh](mailto:founders@superset.sh).
diff --git a/apps/marketing/public/blog/roadmap-to-100-agents/cover.png b/apps/marketing/public/blog/roadmap-to-100-agents/cover.png
diff --git a/apps/marketing/public/blog/roadmap-to-100-agents/pipeline-diagram.svg b/apps/marketing/public/blog/roadmap-to-100-agents/pipeline-diagram.svg
diff --git a/apps/marketing/public/blog/roadmap-to-100-agents/proactive-agents.svg b/apps/marketing/public/blog/roadmap-to-100-agents/proactive-agents.svg
diff --git a/apps/marketing/public/blog/roadmap-to-100-agents/quality-gates.svg b/apps/marketing/public/blog/roadmap-to-100-agents/quality-gates.svg