TL;DR: AI coding tools stack into four layers (LLM providers, agent runtimes, coding agents, and orchestrators), and most discourse conflates them. Overstory and Gastown solve coordination at the top. Claude Code and Crush solve coding in the middle. IronClaw solves trust one layer below that. They don't compete. They compose.

The AI Agent Stack: From Sandboxes to Swarms

How Overstory, Gastown, Crush, OpenCode, and the Claw Wars map to four distinct layers of the AI coding agent landscape.

Quick Orientation

If you care about...	Look at...
Coordinating 10+ AI agents on one codebase	Overstory or Gastown (Layer 4)
A single AI agent that writes code well	Claude Code, Crush, or OpenCode (Layer 3)
Preventing your agent from stealing your AWS keys	IronClaw (Layer 2)
The model behind the agent	Claude, GPT, Gemini, local models (Layer 1)

These tools don't compete. They stack. Most people conflate them because they all involve "AI writing code," but they're solving fundamentally different problems at different layers. This post maps the landscape.

The Four-Layer Model

┌─────────────────────────────────────────────────────────┐
│  Layer 4: ORCHESTRATION                                 │
│  "Who does what, when, and how does it merge?"          │
│  Overstory, Gastown, Claude Code Agent Teams            │
├─────────────────────────────────────────────────────────┤
│  Layer 3: CODING AGENT                                  │
│  "One agent, one session, write some code"              │
│  Claude Code, Crush, OpenCode, Cursor, Codex CLI        │
├─────────────────────────────────────────────────────────┤
│  Layer 2: AGENT RUNTIME                                 │
│  "Can I trust this tool not to exfiltrate my data?"     │
│  IronClaw, OpenClaw, ZeroClaw                           │
├─────────────────────────────────────────────────────────┤
│  Layer 1: LLM PROVIDER                                  │
│  "Raw intelligence"                                     │
│  Claude, GPT, Gemini, Llama, Ollama                     │
└─────────────────────────────────────────────────────────┘

Most discourse treats everything in this stack as "AI coding tools" and tries to compare them head-to-head. That's like comparing Kubernetes to React because they both "run JavaScript." Let's look at what each layer actually does.

Layer 4: The Orchestrators

This is where things get interesting. You have one codebase, ten agents, and they all need to write code without destroying each other's work. Two systems dominate this space today.

Overstory

Built by: jayminwest | Language: TypeScript/Bun | Deps: Zero runtime npm dependencies

Overstory turns your Claude Code session into the orchestrator. There's no separate daemon. Your session IS the brain. It spawns workers via tmux into isolated git worktrees and coordinates them through a custom SQLite mail system.
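
As a rough illustration of the spawn step, the sketch below builds the `git worktree` and `tmux` commands that would isolate one worker. The function name, the `.worktrees/` layout, the `agent/` branch prefix, and launching the agent with a bare `claude` command are all assumptions made for this example, not Overstory's actual implementation. The function only returns the commands, so nothing is executed.

```python
from pathlib import PurePosixPath


def build_spawn_commands(repo: str, agent: str, task_id: str) -> list[list[str]]:
    """Return the git and tmux commands that would isolate one worker.

    Hypothetical layout: worktrees live under <repo>/.worktrees/<agent>-<task>.
    """
    name = f"{agent}-{task_id}"
    worktree = str(PurePosixPath(repo) / ".worktrees" / name)
    branch = f"agent/{name}"
    return [
        # Create a new branch and check it out into its own directory.
        ["git", "-C", repo, "worktree", "add", "-b", branch, worktree],
        # Start a detached tmux session whose working directory is the worktree.
        ["tmux", "new-session", "-d", "-s", name, "-c", worktree, "claude"],
    ]


commands = build_spawn_commands("/repo", "builder", "task-42")
```

Each command list could be passed to `subprocess.run` unchanged. Keeping command construction separate from execution makes the isolation rules easy to inspect and test.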

Architecture:

Your Claude Code Session (orchestrator)
  ├── overstory sling lead → Team Lead (tmux + worktree)
  │     ├── overstory sling builder → Builder (tmux + worktree)
  │     ├── overstory sling scout → Scout (tmux + worktree, read-only)
  │     └── overstory sling reviewer → Reviewer (tmux + worktree, read-only)
  └── overstory sling lead → Another Team Lead
        └── ...

What makes it tick:

Component	Implementation
Agent spawning	`overstory sling` creates worktree + tmux + CLAUDE.md overlay
Messaging	Custom SQLite mail (WAL mode, ~1-5ms queries, typed messages)
Isolation	Git worktrees — each agent gets its own branch and directory
Merge pipeline	4-tier: clean merge → auto-resolve → AI-resolve → reimagine
Health monitoring	3-tier watchdog: mechanical daemon → AI triage → monitor agent
Expertise	Mulch integration — agents accumulate domain knowledge across sessions
Observability	events.db + trace + replay + feed + dashboard + inspect + costs
Hierarchy	Depth-limited tree (coordinator → lead → worker), max depth 2
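
To make the messaging row concrete, here is a minimal sketch of a typed mailbox on SQLite in WAL mode. The table schema, the four message types, and the function names are invented for illustration; Overstory's real schema is not described in this post.

```python
import json
import os
import sqlite3
import tempfile

# Hypothetical schema: the CHECK constraint is what makes messages "typed".
SCHEMA = """
CREATE TABLE IF NOT EXISTS mail (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    sender    TEXT NOT NULL,
    recipient TEXT NOT NULL,
    type      TEXT NOT NULL CHECK (type IN ('task', 'status', 'question', 'result')),
    body      TEXT NOT NULL,
    read      INTEGER NOT NULL DEFAULT 0
);
"""


def open_mailbox(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # WAL lets readers keep reading while a single writer appends.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript(SCHEMA)
    return conn


def send(conn: sqlite3.Connection, sender: str, recipient: str, msg_type: str, body: dict) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO mail (sender, recipient, type, body) VALUES (?, ?, ?, ?)",
            (sender, recipient, msg_type, json.dumps(body)),
        )


def receive(conn: sqlite3.Connection, recipient: str) -> list[tuple[str, str, dict]]:
    """Return unread messages for the recipient, oldest first, and mark them read."""
    with conn:
        rows = conn.execute(
            "SELECT id, sender, type, body FROM mail "
            "WHERE recipient = ? AND read = 0 ORDER BY id",
            (recipient,),
        ).fetchall()
        conn.executemany("UPDATE mail SET read = 1 WHERE id = ?", [(row[0],) for row in rows])
    return [(sender, msg_type, json.loads(body)) for _, sender, msg_type, body in rows]


db_path = os.path.join(tempfile.mkdtemp(), "mail.db")
mailbox = open_mailbox(db_path)
send(mailbox, "lead-1", "builder-1", "task", {"task_id": "task-42"})
inbox = receive(mailbox, "builder-1")
```

Because every agent process opens the same database file, no broker or daemon is needed. The file is the message bus.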

The two-layer instruction model is clever: each agent gets a base definition (the HOW — agents/builder.md describes what a builder does) plus a per-task overlay CLAUDE.md (the WHAT — task ID, file scope, spec path, branch name). The orchestrator only passes WHAT. The base definition already has HOW.
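
A small sketch of how the two layers could be combined into the file an agent reads. The `TaskOverlay` fields mirror the four items listed above; the class name, the function name, and the rendered layout are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskOverlay:
    """The per-task WHAT. Field names are illustrative."""
    task_id: str
    file_scope: tuple[str, ...]
    spec_path: str
    branch: str


def render_instructions(base_definition: str, overlay: TaskOverlay) -> str:
    """Append the task's WHAT to the role's HOW."""
    scope = "\n".join(f"- {path}" for path in overlay.file_scope)
    what = (
        "## Task\n"
        f"Task ID: {overlay.task_id}\n"
        f"Spec: {overlay.spec_path}\n"
        f"Branch: {overlay.branch}\n"
        f"File scope:\n{scope}\n"
    )
    return base_definition.rstrip() + "\n\n" + what


base = "# Builder\nImplement the spec. Run the tests before reporting back.\n"
overlay = TaskOverlay("task-42", ("src/auth/",), "specs/auth.md", "agent/builder-task-42")
instructions = render_instructions(base, overlay)
```

The base text never changes between tasks, so the orchestrator only has to produce the small overlay for each spawn.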

The merge pipeline is the most sophisticated in any orchestrator I've seen. Four escalation tiers:

Clean merge: `git merge --no-ff`, zero conflicts. Done.
Auto-resolve: Parse conflict markers, keep the agent's changes.
AI-resolve: Claude analyzes the conflicts with mulch history context and generates merged content.
Reimagine: Nuclear option — abort the merge, replay the agent's changes from scratch on a fresh canonical state.

If tier N fails, it escalates to tier N+1. Conflict history from mulch informs which tiers to skip (if this file always fails at tier 2, start at tier 3).
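
The escalation rule can be sketched as a loop over tiers with a per-file failure history. This is one possible reading of the behavior described above: the `threshold` parameter, the failure-count dictionary, and the stubbed tier functions are assumptions, and a real pipeline would call git and a model instead of lambdas.

```python
from typing import Callable

# A tier takes a branch name and reports whether the merge succeeded.
Tier = Callable[[str], bool]


def merge_with_escalation(
    branch: str,
    tiers: list[tuple[str, Tier]],
    failures: dict[int, int],
    threshold: int = 3,
) -> str:
    """Try tiers in order; skip any non-final tier that has failed `threshold` times."""
    last = len(tiers) - 1
    for index, (name, attempt) in enumerate(tiers):
        if index < last and failures.get(index, 0) >= threshold:
            continue  # history says this tier does not work here
        if attempt(branch):
            return name
        failures[index] = failures.get(index, 0) + 1
    raise RuntimeError(f"all merge tiers failed for {branch}")


# Stub tiers standing in for the four real strategies.
tiers = [
    ("clean-merge", lambda branch: False),
    ("auto-resolve", lambda branch: False),
    ("ai-resolve", lambda branch: True),
    ("reimagine", lambda branch: True),
]
history: dict[int, int] = {}
resolved_by = merge_with_escalation("agent/builder-task-42", tiers, history)
```

The final tier is never skipped, so there is always a last resort, and each failure feeds the history that shortens the next merge.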

Gastown (Steve Yegge)

Built by: Steve Yegge | Language: Go | Data: Dolt (git-backed database)

Gastown uses theatrical metaphors for the same fundamental primitives:

Gastown term	Overstory equivalent	What it is
Mayor	Orchestrator session	Your primary Claude Code with full context
Polecats	Builder/Scout/Reviewer	Worker agents with persistent identity
Rigs	`.overstory/` per-project	Project containers wrapping git repos
Hooks	Git worktrees	Persistent storage that survives crashes
Convoys	Groups	Bundles of work items assigned to agents
Beads	Beads	Issue tracking (both use `bd`)

Shared DNA: Both systems use beads (bd) for issue tracking, tmux for agent sessions, git worktrees for isolation, and CLAUDE.md hooks for agent instructions. They diverged on storage (SQLite vs Dolt), language (TypeScript vs Go), and orchestration philosophy.

Key differences:

	Overstory	Gastown
Hierarchy	Explicit depth-limited tree	Flatter (Mayor → Polecat)
Merge	4-tier escalation pipeline	Git-based, less formalized
Expertise	Mulch integration (persistent domain knowledge)	None documented
Observability	7+ query tools (trace, replay, feed, etc.)	Beads-based tracking
Multi-runtime	Claude Code only	Claude Code + Codex
Scale target	Structured specialist teams	20-30 agents comfortably
Runtime deps	Zero	Go modules

The relationship: They're siblings, not competitors. Same parents (beads, worktrees, tmux, CLAUDE.md), different upbringings. Gastown is broader and flatter (multi-runtime, more agents); Overstory is deeper and more structured (typed hierarchy, 4-tier merge, integrated expertise, richer observability).

Both represent what Paddo's blog calls "operational multi-agent" — agents with external state management and git worktrees for isolation, as opposed to BMAD-style "SDLC theater" that recreates human organizational bottlenecks with sequential persona handoffs.

Layer 3: The Coding Agents

These are the individual agents that actually write code. One session, one agent, one conversation. The orchestrators in Layer 4 spawn many of these.

Claude Code (Anthropic)

The 800-pound gorilla. Terminal-based, Claude-only, with an experimental built-in Agent Teams feature that's disabled by default. Agent Teams uses Claude Code's own Task tool to spawn subagents — simpler than Overstory/Gastown but without external state persistence, worktree isolation, or merge pipelines. If a session crashes, the coordination state dies with it.

Crush (Charm)

Language: Go | TUI: Bubble Tea | Provider abstraction: Fantasy library

Built by the Charm team (Bubble Tea, Lip Gloss, Glamour) after they recruited the original OpenCode (Go) creator. The coordinator pattern is the architectural core: a centralized hub managing per-session FIFO queues, routing prompts to agents with injected dependencies, handling OAuth refresh, and coordinating three-layer config merging.
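
The per-session FIFO idea can be shown in a few lines. Crush itself is written in Go; this Python sketch only illustrates the pattern, and the class and method names are invented rather than taken from Crush's code.

```python
from collections import defaultdict, deque
from typing import Callable, Optional


class Coordinator:
    """Hub that keeps one FIFO prompt queue per session."""

    def __init__(self, agent: Callable[[str, str], str]):
        # Injected dependency: any callable (session_id, prompt) -> reply.
        self._agent = agent
        self._queues: dict[str, deque[str]] = defaultdict(deque)

    def submit(self, session_id: str, prompt: str) -> None:
        self._queues[session_id].append(prompt)

    def run_next(self, session_id: str) -> Optional[str]:
        """Process the oldest queued prompt for this session, if any."""
        queue = self._queues[session_id]
        if not queue:
            return None
        return self._agent(session_id, queue.popleft())


hub = Coordinator(lambda session_id, prompt: f"[{session_id}] {prompt.upper()}")
hub.submit("s1", "first")
hub.submit("s2", "other")
hub.submit("s1", "second")
replies = [hub.run_next("s1"), hub.run_next("s1"), hub.run_next("s1")]
```

Prompts within a session are handled strictly in arrival order, while separate sessions never block each other's queues.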

Crush has a sub-agent architecture where coder agents can spawn isolated read-only task agents. This is not multi-agent orchestration — it's delegation within a single session. No agent-to-agent messaging, no shared task queues, no merge pipelines.

What Crush does well: LSP integration, MCP server support (stdio/HTTP/SSE), multi-provider model switching, and the most polished terminal UX in the space (it's Charm, after all).

OpenCode (Anomaly Innovations)

Two lives: The archived Go version (Bubble Tea TUI, monolithic) and the active TypeScript rewrite (client/server, Hono API, SolidJS TUI at 60fps, Bun runtime).

The TypeScript version is a full client/server architecture — HTTP API with SSE for real-time updates, SQLite + Drizzle ORM for persistence, Vercel AI SDK for provider abstraction (75+ models). It has a Coder agent, Task agent, and Title agent, but these are isolated sub-agents within a single session. No inter-agent communication or coordination.

The OpenCode → Crush pipeline: The original Go OpenCode creator joined Charm and built Crush. The TypeScript rewrite is maintained by a different team (Anomaly Innovations). So OpenCode and Crush share architectural philosophy but diverged on language and maintainership.

How They Relate to Layer 4

Simple: Layer 3 agents are the worker processes that Layer 4 orchestrators spawn and coordinate. When Overstory runs `overstory sling builder`, it creates a tmux session running Claude Code. If Gastown supported Crush as a runtime, it would create a tmux session running Crush instead.

Layer 3 tools don't know they're being orchestrated. They just see a CLAUDE.md with instructions and get to work.

Layer 2: The Agent Runtimes (The Claw Wars)

This layer asks a different question entirely: not "how do agents coordinate?" but "can I trust what this agent's tools are doing?"

When Claude Code runs a bash command or writes a file, what prevents it from reading your `.env`, exfiltrating credentials to an external server, or running `rm -rf` on your home directory? Claude Code has built-in safety checks and user confirmation prompts. The Claws argue that's not enough.

IronClaw (NEAR AI / Illia Polosukhin)

Language: Rust | Binary: 3.4MB | Startup: <10ms | Idle RAM: ~7.8MB

The security-first contender. Every tool runs in an isolated WebAssembly sandbox with capability-based permissions inspired by the seL4 microkernel.

The killer feature: host-boundary credential injection.

Traditional approach:
  Tool receives API key → Tool makes request → Hope tool doesn't leak it

IronClaw approach:
  Tool requests HTTP (no auth) → Host injects credentials at boundary →
  Leak detection scans I/O → Tool receives sanitized response

Tools never possess credentials. They can't leak what they don't have. The host (IronClaw runtime) injects secrets into outbound requests and strips them from responses. Aho-Corasick pattern matching catches 15+ credential formats.
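
A minimal sketch of the host-boundary idea, assuming a bearer-token scheme keyed by hostname. IronClaw is written in Rust and uses Aho-Corasick matching; here a plain substring scan stands in for it, and the `Host` and `Request` names are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable
from urllib.parse import urlsplit


@dataclass
class Request:
    url: str
    headers: dict[str, str] = field(default_factory=dict)
    body: str = ""


class Host:
    """Holds secrets on behalf of tools; tools only ever see Request objects."""

    def __init__(self, secrets: dict[str, str]):
        self._secrets = secrets  # hostname -> token, never handed to tools

    def _scan(self, text: str) -> None:
        # Stand-in for multi-pattern matching such as Aho-Corasick.
        for secret in self._secrets.values():
            if secret in text:
                raise PermissionError("credential detected in tool output")

    def send(self, request: Request, transport: Callable[[Request], str]) -> str:
        # 1. Leak detection on everything the tool produced.
        self._scan(" ".join([request.url, request.body, *request.headers.values()]))
        # 2. Inject the credential at the boundary, on a copy the tool never sees.
        hostname = urlsplit(request.url).hostname or ""
        headers = dict(request.headers)
        if hostname in self._secrets:
            headers["Authorization"] = f"Bearer {self._secrets[hostname]}"
        response = transport(Request(request.url, headers, request.body))
        # 3. Strip any secret the remote side echoed back.
        for secret in self._secrets.values():
            response = response.replace(secret, "[REDACTED]")
        return response


host = Host({"api.example.com": "sk-test-123"})


def fake_transport(req: Request) -> str:
    # Pretend server that echoes the auth header it received.
    return f"auth={req.headers.get('Authorization')}"


reply = host.send(Request("https://api.example.com/v1/items"), fake_transport)
```

The tool's own `Request` object is never mutated, so even a tool that inspects it after the call finds no credential.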

WASM sandbox restrictions:

No environment variable access
No arbitrary filesystem access
No direct network sockets
No credential vault access
No process spawning

Tools must hold explicit capability tokens for each permitted action. Rate limiting, memory limits, CPU limits, execution time limits — all enforced at the sandbox boundary.
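
Capability checks of this kind reduce to a default-deny lookup. The sketch below is an assumption-laden illustration: the action names (`fs.write`, `http.get`), the prefix rule for filesystem paths, and the class names are invented, and real enforcement would happen in the WASM host rather than in Python.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Capability:
    action: str    # e.g. "fs.read", "fs.write", "http.get" (hypothetical names)
    resource: str  # directory the action is limited to, or an exact host


class Sandbox:
    def __init__(self, granted: set[Capability]):
        self._granted = frozenset(granted)

    def check(self, action: str, resource: str) -> None:
        """Raise PermissionError unless a granted capability covers the request."""
        for cap in self._granted:
            if cap.action == action and self._covers(cap, resource):
                return
        raise PermissionError(f"{action} on {resource} not permitted")

    @staticmethod
    def _covers(cap: Capability, resource: str) -> bool:
        if cap.action.startswith("fs."):
            root = PurePosixPath(cap.resource)
            target = PurePosixPath(resource)
            # Reject ".." so a path cannot climb out of the granted directory.
            if ".." in target.parts:
                return False
            return target == root or root in target.parents
        return cap.resource == resource


sandbox = Sandbox({Capability("fs.write", "src/auth")})
sandbox.check("fs.write", "src/auth/login.py")  # allowed: returns without raising
```

Anything not explicitly granted fails closed, which is the difference between a capability model and an instruction the agent is merely asked to follow.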

The Claw Landscape

Per the head-to-head comparison:

	OpenClaw	IronClaw	ZeroClaw
Stars	216K	Small (new)	Small (new)
Language	Python	Rust	Go
Security	Weak (plaintext creds, 24 vulns)	Strong (WASM sandbox, host-boundary injection)	Medium (encrypted creds, same-process tools)
Multi-agent	Yes (only Claw with orchestration)	No	No
Binary size	28MB+	3.4MB	8.8MB
Idle RAM	394MB	~7.8MB	~12MB
Channels	~15	Growing	23
Providers	~8	MCP-based	30+

The paradox: OpenClaw is the only Claw with multi-agent orchestration, but it has the worst security. IronClaw has the best security, but no multi-agent. You can't have both yet.

Recommended path: Harden OpenClaw now for orchestration capabilities, monitor IronClaw's multi-agent roadmap (~2,200 lines of code away per estimates), migrate when it's ready.

Why This Layer is Orthogonal to Overstory

Overstory enforces agent boundaries through instructions: "CLAUDE.md says only touch these files in your FILE_SCOPE." IronClaw enforces them through architecture: WASM physically prevents unauthorized file access.

Security concern	Overstory's answer	IronClaw's answer
Agent reads files outside scope	Trust + instructions	WASM capability tokens
Tool steals credentials	Not addressed	Host-boundary injection
Tool exfiltrates data	Not addressed	Network allowlisting + leak detection
Prompt injection	Not addressed	5-layer sanitization
`rm -rf /`	Claude Code's built-in safety	WASM sandbox (no arbitrary FS access)

Meanwhile, IronClaw has zero answers for:

Agent-to-agent communication
File conflict avoidance across agents
Merge pipelines
Agent health monitoring
Expertise accumulation
Agent spawning and scaling

They solve completely different problems. Overstory is urban planning; IronClaw is building codes. You need both for a city, but they're designed by different people for different reasons.

The Dream Stack

Nobody has built this yet, but the layers compose naturally:

┌─────────────────────────────────────────────┐
│  Overstory / Gastown                        │
│  "10 agents, exclusive file scopes,         │
│   SQLite mail, 4-tier merge"                │
├─────────────────────────────────────────────┤
│  Claude Code / Crush / OpenCode             │
│  "Prompt → reason → tool call → code"       │
├─────────────────────────────────────────────┤
│  IronClaw                                   │
│  "Each agent's tools run in WASM sandboxes, │
│   credentials injected at boundary"         │
├─────────────────────────────────────────────┤
│  Claude / GPT / Gemini / Local              │
│  "Raw inference"                            │
└─────────────────────────────────────────────┘

Overstory tells agent-3 to build the auth module. Claude Code handles the reasoning loop. IronClaw ensures agent-3's tools can only access `src/auth/` and can't leak the database password. Claude provides the intelligence.

Today, Overstory skips Layer 2 entirely and trusts Claude Code's built-in safety. That works for trusted development environments. For production agent deployments — where tools run code from untrusted sources, handle customer data, or access production infrastructure — the Claw Wars matter a lot more.

What I'd Watch

Overstory's merge pipeline is the most underrated innovation in this space. Everyone talks about spawning agents. Nobody talks about what happens when 10 agents' branches need to merge. The 4-tier escalation with mulch-informed conflict history is genuinely novel.
IronClaw adding multi-agent. The moment IronClaw ships agent-to-agent messaging and task coordination, it becomes the obvious Layer 2 choice for any orchestrator.
Claude Code Agent Teams maturing. If Anthropic builds worktree isolation, external state persistence, and merge pipelines into the native Agent Teams feature, the case for external orchestrators weakens significantly.
Crush or OpenCode as Overstory/Gastown runtimes. Gastown already supports Codex as an alternative runtime. If Crush or OpenCode becomes a viable agent runtime for orchestrators, you get multi-provider model flexibility at the agent level.

Sources

Overstory (GitHub) — Multi-agent orchestration for Claude Code
Gastown (GitHub) — Multi-agent workspace manager
GasTown and the Two Kinds of Multi-Agent
OpenCode (GitHub) — Open-source AI coding agent
Crush (GitHub) — Glamourous AI coding agent
IronClaw (GitHub) — Security-first agent runtime
Claude Code Agent Teams Docs