Overstory + IronClaw: Bridging Orchestration and Sandboxing
TL;DR: Overstory orchestrates persistent agent sessions (macro). IronClaw sandboxes discrete tool calls (micro). They're complementary, not competing. The most practical integration is replacing Overstory's brittle shell-script PreToolUse guards with IronClaw's Rust-based validation pipeline — a ~1-2 week project that hardens the weakest link without touching the spawn pipeline.
Overstory coordinates 25 agents across git worktrees with SQLite mail and a 4-tier merge pipeline. IronClaw sandboxes individual tool calls in WASM with host-boundary credential injection. They solve completely different problems. This post investigates what happens when you try to combine them.
The Key Insight
Overstory and IronClaw operate at different abstraction levels:
| | Overstory | IronClaw |
|---|---|---|
| Abstraction | Persistent agent sessions | Discrete tool calls |
| Isolation | Git worktrees (filesystem) | WASM sandboxes (process) |
| Enforcement | Shell-script hooks (instructions) | Capability tokens (architecture) |
| Security model | Trust + CLAUDE.md constraints | Zero-trust + host-boundary injection |
| Communication | SQLite mail between agents | None (single-agent only) |
The integration surface is at the PreToolUse hook boundary — where Overstory currently runs shell scripts to validate tool calls.
How Overstory Enforces Security Today
All security enforcement is shell-script-based, running as Claude Code PreToolUse hooks deployed by src/agents/hooks-deployer.ts:
- Path Boundary Guards (all agents): Validates Write/Edit/NotebookEdit file paths are within OVERSTORY_WORKTREE_PATH
- Tool Blocks (non-implementation agents): Blocks Write, Edit, NotebookEdit entirely for scout/reviewer/lead/coordinator/supervisor/monitor
- Bash File Guards (non-implementation agents): Regex-checks Bash commands against ~30 dangerous patterns (sed -i, echo >, rm, git push, bun -e, python -c, etc.) with a whitelist of safe prefixes (overstory, bd, git status, mulch, bun test)
- Bash Danger Guards (all agents): Blocks git push, git reset --hard, wrong branch naming
- Native Team Tool Blocks (all agents): Blocks Claude Code's Task/TeamCreate/SendMessage, forcing agents to use overstory sling for delegation
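To make the first and third guards concrete, here is a minimal TypeScript sketch of the same checks. It is not Overstory's code (the real guards are shell scripts). The function names, the handful of patterns, and the prefix list are illustrative assumptions based on the description above, and symlinks are not resolved.

```typescript
// Illustrative sketch only: names and pattern lists are assumptions,
// not Overstory's actual shell-script implementation.

/** Resolve "." and ".." segments in a POSIX path (no symlink resolution). */
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

/** True when filePath is the worktree root or lies underneath it. */
function isInsideWorktree(filePath: string, worktreeRoot: string): boolean {
  const root = normalizePosix(worktreeRoot);
  const absolute = filePath.charAt(0) === "/" ? filePath : root + "/" + filePath;
  const target = normalizePosix(absolute);
  const prefix = root === "/" ? "/" : root + "/";
  // indexOf(...) === 0 is a prefix test that works on any compile target.
  return target === root || target.indexOf(prefix) === 0;
}

// A small subset of the ~30 patterns mentioned above.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\bsed\s+-i\b/,
  /\becho\b.*>/,
  /\brm\s/,
  /\bgit\s+push\b/,
  /\bbun\s+-e\b/,
  /\bpython\s+-c\b/,
];

const SAFE_PREFIXES: string[] = ["overstory ", "bd ", "git status", "mulch ", "bun test"];

/** Whitelisted prefixes pass first; otherwise any dangerous pattern blocks. */
function isBashAllowed(command: string): boolean {
  const trimmed = command.trim();
  const whitelisted = SAFE_PREFIXES.some(
    (p) => trimmed === p.trim() || trimmed.indexOf(p) === 0,
  );
  if (whitelisted) return true;
  return !DANGEROUS_PATTERNS.some((re) => re.test(trimmed));
}
```

In this sketch, a command such as `git status && rm -rf x` passes because it starts with a whitelisted prefix. Whether Overstory's real guards share that gap is not something this post verifies, but it shows how easily prefix and regex checks can be sidestepped.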
The problem: All of this is sed-based JSON parsing in shell scripts. It works, but it's the brittlest part of the system. IronClaw's Rust validators would be more robust for the same job.
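One way to see the brittleness: a regex extraction (roughly what a sed one-liner does) and a real JSON parser can disagree on the same hook payload. The sketch below assumes the payload carries `tool_name` and `tool_input.file_path` fields, which is how Claude Code passes Write/Edit calls to hooks. The function names and the sample path are made up for illustration.

```typescript
// Illustrative comparison; function names and the sample payload are assumptions.
interface HookPayload {
  tool_name: string;
  tool_input: { file_path?: string; command?: string };
}

/** Regex extraction, similar in spirit to a sed-based one-liner. */
function extractPathWithRegex(raw: string): string | null {
  const m = raw.match(/"file_path"\s*:\s*"([^"]*)"/);
  return m?.[1] ?? null;
}

/** Structured extraction that respects JSON escaping and nesting. */
function extractPathWithParser(raw: string): string | null {
  const payload = JSON.parse(raw) as HookPayload;
  return payload.tool_input.file_path ?? null;
}

// A file name containing a double quote is serialized as \" inside the JSON.
const samplePayload: string = JSON.stringify({
  tool_name: "Write",
  tool_input: { file_path: '/wt/a/we"ird.ts', content: "x" },
});

// The regex stops at the escaped quote and returns a truncated path;
// the parser returns the path the tool would actually write to.
const regexResult = extractPathWithRegex(samplePayload);
const parserResult = extractPathWithParser(samplePayload);
```

Here the regex yields a path ending in a backslash while the parser recovers the full file name, so a guard built on the regex would validate a different path from the one being written.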
What IronClaw Brings to the Table
IronClaw's security model addresses gaps Overstory doesn't touch:
| Security concern | Overstory's answer | IronClaw's answer |
|---|---|---|
| Agent reads files outside scope | Trust + CLAUDE.md instructions | WASM capability tokens |
| Tool steals credentials | Not addressed | Host-boundary injection |
| Tool exfiltrates data | Not addressed | Network allowlisting + leak detection |
| Prompt injection | Not addressed | 5-layer sanitization |
| rm -rf / | Claude Code's built-in safety | WASM sandbox (no FS access) |
Meanwhile, IronClaw has zero answers for agent-to-agent communication, file conflict avoidance, merge pipelines, health monitoring, expertise accumulation, or agent spawning.
Three Integration Approaches
Approach A: IronClaw as Hook Middleware (Recommended)
Replace Overstory's shell-script PreToolUse guards with IronClaw's security pipeline as an external validator.
How it works:
- Overstory continues to spawn agents via tmux + git worktrees
- PreToolUse hooks call ironclaw validate-tool instead of inline bash scripts
- IronClaw handles: path validation, endpoint allowlisting, secret leak detection
- Overstory's FILE_SCOPE maps to IronClaw's capability allowlist
Mapping:
| Overstory concept | IronClaw equivalent |
|---|---|
| FILE_SCOPE paths | Capability allowlist for workspace-read() |
| PreToolUse shell guards | ironclaw validate-tool subprocess |
| OVERSTORY_WORKTREE_PATH | Sandbox root directory |
| Blocked bash patterns | IronClaw command validation |
Pros: Minimal changes to spawn pipeline, gains leak detection and audit logging, replaces brittle sed-based JSON parsing with Rust validators, communication via CLI subprocess (no Rust-TypeScript boundary issues).
Cons: Still no process isolation, ~50-100ms subprocess overhead per tool call, IronClaw requires PostgreSQL (breaks Overstory's zero-dep principle).
Effort: ~1-2 weeks. Modify hooks-deployer.ts to generate hooks calling ironclaw validate-tool --allowlist ... --stdin.
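As a rough sketch of what the generated configuration could look like, the function below builds a Claude Code PreToolUse hooks entry that hands every matched tool call to an external validator. The function and type names are hypothetical, not taken from hooks-deployer.ts. The validator command is passed in as an opaque string because the exact ironclaw validate-tool arguments are not specified here, so the example keeps the ellipsis as written above.

```typescript
// Hypothetical generator; type and function names are assumptions for illustration.
interface HookCommand {
  type: "command";
  command: string;
}

interface HookMatcher {
  matcher: string; // tool names joined with "|", e.g. "Write|Edit"
  hooks: HookCommand[];
}

interface HooksConfig {
  hooks: { PreToolUse: HookMatcher[] };
}

/** Build a settings fragment that routes the given tools through one validator. */
function buildValidatorHooks(validatorCommand: string, toolNames: string[]): HooksConfig {
  return {
    hooks: {
      PreToolUse: [
        {
          matcher: toolNames.join("|"),
          hooks: [{ type: "command", command: validatorCommand }],
        },
      ],
    },
  };
}

// The "..." is left elided on purpose: the allowlist argument format is not specified.
const validatorConfig = buildValidatorHooks(
  "ironclaw validate-tool --allowlist ... --stdin",
  ["Write", "Edit", "NotebookEdit", "Bash"],
);
const settingsJson: string = JSON.stringify(validatorConfig, null, 2);
```

Claude Code pipes the tool call to the hook command as JSON on stdin and treats exit code 2 as a block. Whether ironclaw validate-tool follows that exit-code convention is an open question, so a thin wrapper script may be needed.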
Approach B: Agents in IronClaw Docker Containers
Replace tmux sessions with IronClaw's Docker orchestrator. Each agent runs in a container with network isolation.
How it works:
- overstory sling calls IronClaw orchestrator API instead of tmux new-session
- Each agent gets a Docker container with worktree bind-mounted, network access only through IronClaw's HTTP proxy, per-job auth token, reduced Linux capabilities
- Claude Code runs inside container with --dangerously-skip-permissions
- Overstory's overlay generation (CLAUDE.md + hooks) remains the same
| Overstory | IronClaw |
|---|---|
| Git worktree | Container mount (bind-mount worktree) |
| FILE_SCOPE | Filesystem mount permissions |
| Tmux session | Docker container |
| PreToolUse hooks | IronClaw capability tokens |
| OVERSTORY_AGENT_NAME env | Job-scoped auth token |
| SQLite mail | Mounted DB or switch to TCP |
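IronClaw's orchestrator API is not shown in this post, so the sketch below uses plain Docker CLI flags as a stand-in to show what each container would need. The spec type, function name, image, network name, and JOB_TOKEN variable are all hypothetical. The Docker flags and the Claude Code flag are real.

```typescript
// Stand-in sketch using Docker CLI flags; not IronClaw's orchestrator API.
interface AgentContainerSpec {
  agentName: string;
  worktreePath: string; // host path of the agent's git worktree
  image: string; // hypothetical image with Claude Code installed
  network: string; // hypothetical pre-created network without direct egress
  proxyUrl: string; // hypothetical address of the egress HTTP proxy
  jobToken: string; // per-job auth token
}

/** Build the argument list for `docker run` for one agent. */
function buildDockerArgs(spec: AgentContainerSpec): string[] {
  return [
    "run",
    "--rm",
    "--name", `overstory-${spec.agentName}`,
    "--cap-drop=ALL", // reduced Linux capabilities
    "--security-opt", "no-new-privileges",
    "--network", spec.network, // network isolation
    "--volume", `${spec.worktreePath}:/workspace`, // bind-mounted worktree
    "--workdir", "/workspace",
    "--env", `HTTPS_PROXY=${spec.proxyUrl}`, // route traffic through the proxy
    "--env", `JOB_TOKEN=${spec.jobToken}`,
    "--env", `OVERSTORY_AGENT_NAME=${spec.agentName}`,
    spec.image,
    "claude", "--dangerously-skip-permissions",
  ];
}

const builderArgs = buildDockerArgs({
  agentName: "builder-1",
  worktreePath: "/tmp/worktrees/builder-1",
  image: "example/claude-agent:latest",
  network: "agents-internal",
  proxyUrl: "http://proxy.internal:3128",
  jobToken: "example-token",
});
```

Passing the token through `--env` keeps the sketch short but leaves it visible in `docker inspect`. A real deployment would more likely inject it at the proxy, in line with the host-boundary injection model described earlier.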
Pros: Real process-level isolation, network isolation, credential injection, audit trail.
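Cons: Docker dependency, SQLite's WAL mode needs shared memory and working file locks among all processes using the database (which is not guaranteed across container boundaries), Claude Code inside Docker has compatibility friction, and two orchestration systems must be coordinated.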
Effort: ~4-8 weeks. Replace tmux.ts with Docker container management via IronClaw orchestrator API.
Approach C: Fork IronClaw, Add Multi-Agent (Not Recommended)
Fork IronClaw and add Overstory's multi-agent orchestration concepts natively in Rust. Agent hierarchy, git worktrees, SQLite mail, merge queue — all reimplemented.
Why not: Massive effort (~3-6 months), Rust vs TypeScript skill mismatch, Claude Code can't run inside WASM (it's a full Node.js CLI), fundamentally different abstraction levels (IronClaw sandboxes tools, not sessions), and PostgreSQL requirement conflicts with zero-dep philosophy.
The Fundamental Architecture Mismatch
The deepest blocker for full integration:
- Overstory orchestrates persistent Claude Code sessions (long-running LLM agents in tmux)
- IronClaw sandboxes discrete tool calls (WASM functions with capability tokens)
Claude Code itself cannot run inside WASM — it's a full Node.js CLI application. IronClaw's WASM sandbox is designed for small, stateless tool functions, not persistent agent sessions.
This means any integration must work at the boundary between these abstraction levels. Approach A does this cleanly by intercepting tool calls at the hook layer. Approach B wraps the entire agent process in Docker. Approach C would require rethinking what "agent" means in IronClaw's architecture.
What's Missing in IronClaw for Multi-Agent
For IronClaw to natively support what Overstory does, it would need:
- Agent hierarchy with depth limits
- Parent-child delegation model
- Inter-agent messaging system
- Agent capability types (scout vs builder vs reviewer)
- Git worktree management
- Merge queue with conflict resolution
- Health monitoring with progressive escalation
- Expertise accumulation across sessions
That's essentially building Overstory inside IronClaw. The better path is keeping them separate and composing at the hook boundary.
Recommendation
Approach A is the right move. It addresses Overstory's weakest link (brittle shell-script guards) while preserving its strengths (zero-dep, tmux-based lifecycle, git worktrees, SQLite mail). The integration surface is clean: PreToolUse hooks delegate to ironclaw validate-tool instead of inline sed/grep scripts.
For trusted development environments, Overstory's current hook-based enforcement is sufficient. For production agent deployments — where tools handle customer data or access production infrastructure — the IronClaw hook middleware adds meaningful security without architectural disruption.
The longer-term play is watching IronClaw's multi-agent roadmap. The moment it ships agent-to-agent messaging and task coordination, a deeper integration (Approach B or a new Approach D) becomes worth revisiting.
Sources
- Overstory source code (v0.5.7)
- IronClaw (GitHub) — Security-first agent runtime
- IronClaw Deep Dive
- IronClaw vs ZeroClaw vs OpenClaw
- The AI Agent Stack