Overstory + IronClaw: Bridging Orchestration and Sandboxing

TL;DR: Overstory orchestrates persistent agent sessions (macro). IronClaw sandboxes discrete tool calls (micro). They're complementary, not competing. The most practical integration is replacing Overstory's brittle shell-script PreToolUse guards with IronClaw's Rust-based validation pipeline — a ~1-2 week project that hardens the weakest link without touching the spawn pipeline.

Overstory coordinates 25 agents across git worktrees with SQLite mail and a 4-tier merge pipeline. IronClaw sandboxes individual tool calls in WASM with host-boundary credential injection. They solve completely different problems. This post investigates what happens when you try to combine them.


The Key Insight

Overstory and IronClaw operate at different abstraction levels:

|  | Overstory | IronClaw |
| --- | --- | --- |
| Abstraction | Persistent agent sessions | Discrete tool calls |
| Isolation | Git worktrees (filesystem) | WASM sandboxes (process) |
| Enforcement | Shell-script hooks (instructions) | Capability tokens (architecture) |
| Security model | Trust + CLAUDE.md constraints | Zero-trust + host-boundary injection |
| Communication | SQLite mail between agents | None (single-agent only) |

The integration surface is at the PreToolUse hook boundary — where Overstory currently runs shell scripts to validate tool calls.


How Overstory Enforces Security Today

All security enforcement is shell-script-based, running as Claude Code PreToolUse hooks deployed by src/agents/hooks-deployer.ts:

Path Boundary Guards (all agents): Validates that Write/Edit/NotebookEdit file paths fall within OVERSTORY_WORKTREE_PATH

Tool Blocks (non-implementation agents): Blocks Write, Edit, NotebookEdit entirely for scout/reviewer/lead/coordinator/supervisor/monitor

Bash File Guards (non-implementation agents): Regex-checks Bash commands against ~30 dangerous patterns (sed -i, echo >, rm, git push, bun -e, python -c, etc.) with a whitelist of safe prefixes (overstory, bd, git status, mulch, bun test)

Bash Danger Guards (all agents): Blocks git push, git reset --hard, and branch names that break the naming convention

Native Team Tool Blocks (all agents): Blocks Claude Code's Task/TeamCreate/SendMessage — forces agents to use overstory sling for delegation
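To make the first two guards concrete, here is a minimal TypeScript sketch of the same checks. It is not Overstory's implementation (the real guards are shell scripts); the `ToolCall` shape, the `canWrite` flag, and the function names are assumptions made for illustration.

```typescript
import * as path from "node:path";

// Hypothetical, simplified view of a tool call. The field names are
// illustrative and are not taken from Overstory's source.
interface ToolCall {
  tool: string;
  filePath?: string;
}

// Tools that write to disk, as listed in the guards above.
const FILE_TOOLS = new Set(["Write", "Edit", "NotebookEdit"]);

// True when `target` resolves to the worktree root or something beneath it.
// Comparing resolved paths avoids the prefix trap where "/work/agent-10"
// appears to sit inside "/work/agent-1". Symlinks are not followed here.
function isInsideWorktree(worktree: string, target: string): boolean {
  const root = path.resolve(worktree);
  const rel = path.relative(root, path.resolve(root, target));
  if (rel === "") return true;
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Returns a rejection reason, or null when the call is allowed.
// `canWrite` is false for scout/reviewer/lead/coordinator/supervisor/monitor.
function checkToolCall(call: ToolCall, worktree: string, canWrite: boolean): string | null {
  if (!FILE_TOOLS.has(call.tool)) return null;
  if (!canWrite) return `${call.tool} is blocked for this agent type`;
  if (call.filePath === undefined || !isInsideWorktree(worktree, call.filePath)) {
    return `${call.tool} path is outside ${worktree}`;
  }
  return null;
}

console.log(checkToolCall({ tool: "Write", filePath: "../other/a.ts" }, "/work/agent-1", true));
```

The logic itself is short. The fragile part is reading the tool name and path out of the hook payload reliably before these checks run.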

The problem: All of this is sed-based JSON parsing in shell scripts. It works, but it's the brittlest part of the system. IronClaw's Rust validators would be more robust for the same job.
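A small demonstration of why text-matching on JSON is brittle. The regex below only approximates what a sed one-liner does and is not Overstory's actual script; the payload shape (`tool_name`, `tool_input.file_path`) is an assumption about what a PreToolUse hook receives on stdin.

```typescript
// Regex-style extraction, roughly what a sed/grep one-liner does:
// capture everything after "file_path": up to the next double quote.
function extractWithRegex(raw: string): string | undefined {
  const match = raw.match(/"file_path"\s*:\s*"([^"]*)"/);
  return match?.[1];
}

// Structured extraction: parse the payload, then read the field.
function extractWithParser(raw: string): string | undefined {
  const payload = JSON.parse(raw) as { tool_input?: { file_path?: unknown } };
  const value = payload.tool_input?.file_path;
  return typeof value === "string" ? value : undefined;
}

// A path containing a double quote. JSON encodes it as \" inside the string,
// and the regex stops at that escaped quote.
const raw = JSON.stringify({
  tool_name: "Write",
  tool_input: { file_path: '/worktree/x"/../../etc/passwd' },
});

console.log("regex :", extractWithRegex(raw));
console.log("parser:", extractWithParser(raw));
```

The regex returns a truncated value that still appears to sit under /worktree, while the parser returns the full path, which climbs out to /etc/passwd once normalized. A guard that validates the truncated value approves a write it should reject.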


What IronClaw Brings to the Table

IronClaw's security model addresses gaps Overstory doesn't touch:

| Security concern | Overstory's answer | IronClaw's answer |
| --- | --- | --- |
| Agent reads files outside scope | Trust + CLAUDE.md instructions | WASM capability tokens |
| Tool steals credentials | Not addressed | Host-boundary injection |
| Tool exfiltrates data | Not addressed | Network allowlisting + leak detection |
| Prompt injection | Not addressed | 5-layer sanitization |
| rm -rf / | Claude Code's built-in safety | WASM sandbox (no FS access) |

Meanwhile, IronClaw has zero answers for agent-to-agent communication, file conflict avoidance, merge pipelines, health monitoring, expertise accumulation, or agent spawning.


Three Integration Approaches

Approach A: IronClaw as Hook Middleware

Replace Overstory's shell-script PreToolUse guards with IronClaw's security pipeline as an external validator.

How it works:

  1. Overstory continues to spawn agents via tmux + git worktrees
  2. PreToolUse hooks call ironclaw validate-tool instead of inline bash scripts
  3. IronClaw handles: path validation, endpoint allowlisting, secret leak detection
  4. Overstory's FILE_SCOPE maps to IronClaw's capability allowlist

Mapping:

| Overstory concept | IronClaw equivalent |
| --- | --- |
| FILE_SCOPE paths | Capability allowlist for workspace-read() |
| PreToolUse shell guards | ironclaw validate-tool subprocess |
| OVERSTORY_WORKTREE_PATH | Sandbox root directory |
| Blocked bash patterns | IronClaw command validation |

Pros: Minimal changes to spawn pipeline, gains leak detection and audit logging, replaces brittle sed-based JSON parsing with Rust validators, communication via CLI subprocess (no Rust-TypeScript boundary issues).

Cons: Still no process isolation, ~50-100ms subprocess overhead per tool call, IronClaw requires PostgreSQL (breaks Overstory's zero-dep principle).

Effort: ~1-2 weeks. Modify hooks-deployer.ts to generate hooks calling ironclaw validate-tool --allowlist ... --stdin.
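A rough sketch of what the generated hook entry might look like. Everything here is an assumption except the `ironclaw validate-tool` command and its `--allowlist` and `--stdin` flags, which come from the description above. In particular, the comma-joined allowlist format, the matcher string, and the function names are guesses for illustration, not the real hooks-deployer.ts or IronClaw's documented CLI.

```typescript
// Shape of one PreToolUse entry in a Claude Code settings file (simplified).
interface HookEntry {
  matcher: string;
  hooks: { type: "command"; command: string }[];
}

// Quote a value for a POSIX shell: wrap it in single quotes and escape any
// embedded single quote as '\''.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build a hook that pipes the tool call to the external validator.
// ASSUMPTION: the argument format for --allowlist is not given in the post;
// a comma-separated list of FILE_SCOPE paths is used here as a placeholder.
function buildValidatorHook(fileScope: string[]): HookEntry {
  const allowlist = fileScope.join(",");
  const command = `ironclaw validate-tool --allowlist ${shellQuote(allowlist)} --stdin`;
  return {
    matcher: "Write|Edit|NotebookEdit|Bash",
    hooks: [{ type: "command", command }],
  };
}

const settings = {
  hooks: { PreToolUse: [buildValidatorHook(["src/mail", "docs"])] },
};
console.log(JSON.stringify(settings, null, 2));
```

Keeping the boundary at a CLI subprocess means the deployer only has to emit a string, which is why this approach avoids any Rust-TypeScript interop.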

Approach B: Agents in IronClaw Docker Containers

Replace tmux sessions with IronClaw's Docker orchestrator. Each agent runs in a container with network isolation.

How it works:

  1. overstory sling calls IronClaw orchestrator API instead of tmux new-session
  2. Each agent gets a Docker container with worktree bind-mounted, network access only through IronClaw's HTTP proxy, per-job auth token, reduced Linux capabilities
  3. Claude Code runs inside container with --dangerously-skip-permissions
  4. Overstory's overlay generation (CLAUDE.md + hooks) remains the same

Mapping:

| Overstory | IronClaw |
| --- | --- |
| Git worktree | Container mount (bind-mount worktree) |
| FILE_SCOPE | Filesystem mount permissions |
| Tmux session | Docker container |
| PreToolUse hooks | IronClaw capability tokens |
| OVERSTORY_AGENT_NAME env | Job-scoped auth token |
| SQLite mail | Mounted DB or switch to TCP |
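The post does not describe the shape of IronClaw's orchestrator API, so the sketch below expresses the same container settings as plain `docker run` arguments instead. The spec fields, the image name, the /workspace mount target, and the IRONCLAW_JOB_TOKEN variable name are all hypothetical. `--cap-drop ALL` is the strictest reading of "reduced Linux capabilities" and may be tighter than what IronClaw applies.

```typescript
// Hypothetical description of one agent's container; not an IronClaw type.
interface AgentContainerSpec {
  agentName: string;
  worktreePath: string; // host path of the agent's git worktree
  image: string;        // assumed image with Claude Code installed
  network: string;      // assumed pre-created Docker network that only reaches the proxy
  proxyUrl: string;     // assumed address of the HTTP proxy
}

// Translate the spec into arguments for `docker run`.
function dockerRunArgs(spec: AgentContainerSpec): string[] {
  return [
    "run", "--detach",
    "--name", `overstory-${spec.agentName}`,
    "--network", spec.network,
    "--cap-drop", "ALL",
    "--mount", `type=bind,source=${spec.worktreePath},target=/workspace`,
    "--workdir", "/workspace",
    "--env", `HTTP_PROXY=${spec.proxyUrl}`,
    "--env", `HTTPS_PROXY=${spec.proxyUrl}`,
    "--env", `OVERSTORY_AGENT_NAME=${spec.agentName}`,
    // A bare variable name makes docker copy the value from the caller's
    // environment, which keeps the token out of the argument list.
    "--env", "IRONCLAW_JOB_TOKEN",
    spec.image,
    "claude", "--dangerously-skip-permissions",
  ];
}

const args = dockerRunArgs({
  agentName: "builder-1",
  worktreePath: "/work/agent-1",
  image: "overstory-agent:latest",
  network: "agents-net",
  proxyUrl: "http://proxy:8080",
});
console.log(["docker", ...args].join(" "));
```

The bind mount is also where the SQLite concern arises: a mail database shared this way is opened by processes in different containers, which is the case WAL mode may not handle.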

Pros: Real process-level isolation, network isolation, credential injection, audit trail.

Cons: Docker dependency, SQLite WAL may not work across container boundaries, Claude Code inside Docker has compatibility friction, coordinating two orchestration systems.

Effort: ~4-8 weeks. Replace tmux.ts with Docker container management via IronClaw orchestrator API.

Approach C: Native Multi-Agent Orchestration in IronClaw

Fork IronClaw and add Overstory's multi-agent orchestration concepts natively in Rust. Agent hierarchy, git worktrees, SQLite mail, and the merge queue would all be reimplemented.

Why not: Massive effort (~3-6 months), Rust vs TypeScript skill mismatch, Claude Code can't run inside WASM (it's a full Node.js CLI), fundamentally different abstraction levels (IronClaw sandboxes tools, not sessions), and PostgreSQL requirement conflicts with zero-dep philosophy.


The Fundamental Architecture Mismatch

The deepest blocker for full integration:

  • Overstory orchestrates persistent Claude Code sessions (long-running LLM agents in tmux)
  • IronClaw sandboxes discrete tool calls (WASM functions with capability tokens)

Claude Code itself cannot run inside WASM — it's a full Node.js CLI application. IronClaw's WASM sandbox is designed for small, stateless tool functions, not persistent agent sessions.

This means any integration must work at the boundary between these abstraction levels. Approach A does this cleanly by intercepting tool calls at the hook layer. Approach B wraps the entire agent process in Docker. Approach C would require rethinking what "agent" means in IronClaw's architecture.


What's Missing in IronClaw for Multi-Agent

For IronClaw to natively support what Overstory does, it would need:

  • Agent hierarchy with depth limits
  • Parent-child delegation model
  • Inter-agent messaging system
  • Agent capability types (scout vs builder vs reviewer)
  • Git worktree management
  • Merge queue with conflict resolution
  • Health monitoring with progressive escalation
  • Expertise accumulation across sessions

That's essentially building Overstory inside IronClaw. The better path is keeping them separate and composing at the hook boundary.


Recommendation

Approach A is the right move. It addresses Overstory's weakest link (brittle shell-script guards) while preserving its strengths (zero-dep, tmux-based lifecycle, git worktrees, SQLite mail). The integration surface is clean: PreToolUse hooks delegate to ironclaw validate-tool instead of inline sed/grep scripts.

For trusted development environments, Overstory's current hook-based enforcement is sufficient. For production agent deployments — where tools handle customer data or access production infrastructure — the IronClaw hook middleware adds meaningful security without architectural disruption.

The longer-term play is watching IronClaw's multi-agent roadmap. The moment it ships agent-to-agent messaging and task coordination, a deeper integration (Approach B or a new Approach D) becomes worth revisiting.

