name: building-multiagent-systems
description: This skill should be used when designing or implementing systems with multiple AI agents that coordinate to accomplish tasks. Triggers on "multi-agent", "orchestrator", "sub-agent", "coordination", "delegation", "parallel agents", "sequential pipeline", "fan-out", "map-reduce", "spawn agents", "agent hierarchy".

Building Multi-Agent, Tool-Using Agentic Systems

Overview

Comprehensive architecture patterns for multi-agent systems where AI agents coordinate to accomplish complex tasks using tools. Language-agnostic and applicable across TypeScript, Python, Go, Rust, and other environments.

Discovery Questions (Required)

Before architecting any system, ask these six mandatory questions:

1. Starting Point - Greenfield, adding to an existing system, or fixing a current implementation?
2. Primary Use Case - Parallel work, sequential pipeline, recursive delegation, peer collaboration, work queues, or other?
3. Scale Expectations - Small (2-5 agents), medium (10-50), or large (100+)?
4. State Requirements - Stateless runs, session-based, or persistent across crashes?
5. Tool Coordination - Independent agents, shared read-only resources, write coordination, or rate-limited APIs?
6. Existing Constraints - Language, framework, performance needs, compliance requirements?

Foundational Architecture

Four-Layer Stack

Every agent follows the four-layer architecture for testability, safety, and modularity:

Layer	Name	Responsibility
1	Reasoning (LLM)	Plans, critiques, decides which tools to call
2	Orchestration	Validates, routes, enforces policy, spawns sub-agents
3	Tool Bus	Schema validation, tool execution coordination
4	Deterministic Adapters	File I/O, APIs, shell commands, database access

Critical Rule: Everything below Layer 1 must be deterministic. No LLM calls in tools.

See references/four-layer-architecture.md for detailed implementation with code examples.
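
As a rough illustration of the layer boundaries, the sketch below wires a policy-enforcing orchestrator (Layer 2) to a schema-checking tool bus (Layer 3) and a pure adapter (Layer 4). All names here (`ToolSpec`, `ToolBus`, `Orchestrator`, `word_count`) are hypothetical and not taken from the reference files. Layer 1 is represented only by the tool-call dictionary an LLM would be expected to produce.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    """Layer 3 contract: tool name, required argument types, deterministic handler."""
    name: str
    params: dict[str, type]
    handler: Callable[..., Any]  # Layer 4 adapter; must not call an LLM


class ToolBus:
    """Layer 3: validates arguments against the schema, then runs the adapter."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def call(self, name: str, args: dict[str, Any]) -> Any:
        spec = self._tools.get(name)
        if spec is None:
            raise KeyError(f"unknown tool: {name}")
        if set(args) != set(spec.params):
            raise ValueError(f"{name} expects arguments {sorted(spec.params)}")
        for key, expected in spec.params.items():
            if not isinstance(args[key], expected):
                raise TypeError(f"{name}.{key} must be {expected.__name__}")
        return spec.handler(**args)


class Orchestrator:
    """Layer 2: enforces policy before a tool call reaches the bus."""

    def __init__(self, bus: ToolBus, allowed: set[str]) -> None:
        self.bus = bus
        self.allowed = allowed

    def execute(self, tool_call: dict[str, Any]) -> Any:
        name = tool_call["tool"]
        if name not in self.allowed:
            raise PermissionError(f"tool not permitted: {name}")
        return self.bus.call(name, tool_call["args"])


def word_count(text: str) -> int:
    """Layer 4 adapter: pure function, same input always gives the same output."""
    return len(text.split())
```

Because nothing below the reasoning layer calls a model, the orchestrator, bus, and adapter can each be unit tested with hand-written tool calls.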

Foundational Patterns

Pattern	Purpose
Event-Sourcing	All state changes as events for audit trails and replay
Hierarchical IDs	Encode delegation hierarchy (e.g., `session.1.2`) for cost aggregation
Agent State Machines	Explicit states (idle → thinking → tool_execution → stopped) with invalid transition errors
Communication	EventEmitter for state changes, promises for result collection
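
The state-machine and event-notification rows can be sketched together. This is a minimal Python sketch: the transition table is an assumption (the text only names the four states), and a plain listener list stands in for an EventEmitter.

```python
from enum import Enum
from typing import Callable


class AgentState(Enum):
    IDLE = "idle"
    THINKING = "thinking"
    TOOL_EXECUTION = "tool_execution"
    STOPPED = "stopped"


class InvalidTransition(Exception):
    """Raised when an agent attempts a transition the table does not allow."""


# Assumed transition table; adjust to the states a real agent needs.
_ALLOWED: dict[AgentState, set[AgentState]] = {
    AgentState.IDLE: {AgentState.THINKING, AgentState.STOPPED},
    AgentState.THINKING: {AgentState.TOOL_EXECUTION, AgentState.IDLE, AgentState.STOPPED},
    AgentState.TOOL_EXECUTION: {AgentState.THINKING, AgentState.STOPPED},
    AgentState.STOPPED: set(),  # terminal state
}

Listener = Callable[[str, AgentState, AgentState], None]


class AgentStateMachine:
    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id  # hierarchical ID such as "session.1.2"
        self.state = AgentState.IDLE
        self._listeners: list[Listener] = []

    def on_transition(self, listener: Listener) -> None:
        """Subscribe to state changes (stand-in for an EventEmitter)."""
        self._listeners.append(listener)

    def transition(self, new_state: AgentState) -> None:
        if new_state not in _ALLOWED[self.state]:
            raise InvalidTransition(
                f"{self.agent_id}: {self.state.value} -> {new_state.value}"
            )
        old_state, self.state = self.state, new_state
        for listener in self._listeners:
            listener(self.agent_id, old_state, new_state)
```

Recording every emitted transition in an append-only log is one simple way to get the audit trail that event-sourcing calls for.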

Seven Coordination Patterns

Choose based on discovery question answers:

Pattern	Use Case	Trade-offs
Fan-Out/Fan-In	Parallel independent work	Fast but costly; watch for orphans
Sequential Pipeline	Multi-stage transformations	Bottleneck at slowest stage
Recursive Delegation	Hierarchical task breakdown	Must add depth limits
Work-Stealing Queue	1000+ tasks with load balancing	No built-in priority
Map-Reduce	Cost optimization	Cheap map ($0.01), smart reduce ($0.15)
Peer Collaboration	LLM council for bias reduction	Expensive (3N+1 calls), slow
MAKER	Zero-error tasks (100K+ steps)	5× cost but ~0% error rate

See references/coordination-patterns.md for detailed implementations.
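
To make the depth-limit trade-off for Recursive Delegation concrete, here is a small sketch. The `split`, `solve`, and `combine` callables are hypothetical stand-ins for what sub-agents would do, and the default limit of 3 is an arbitrary assumption. Depth is derived from the hierarchical ID, so `session.1.2` sits at depth 2.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def delegate(
    task: T,
    split: Callable[[T], list[T]],
    solve: Callable[[T], R],
    combine: Callable[[list[R]], R],
    max_depth: int = 3,
    agent_id: str = "session",
    trace: Optional[list[str]] = None,
) -> R:
    """Recursively break a task down, but never spawn below max_depth."""
    if trace is not None:
        trace.append(agent_id)  # record which agent IDs were created

    depth = agent_id.count(".")
    # At the depth limit the agent must solve the task itself.
    subtasks = split(task) if depth < max_depth else []
    if not subtasks:
        return solve(task)

    results = [
        delegate(sub, split, solve, combine, max_depth, f"{agent_id}.{i}", trace)
        for i, sub in enumerate(subtasks, start=1)
    ]
    return combine(results)
```

In this sketch an agent at the limit falls back to solving its task directly. Raising an error instead would be an equally reasonable policy when an oversized leaf task is considered a bug.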

Pattern Selection Guide

Requirement	Recommended Pattern
Parallel independent tasks	Fan-Out/Fan-In
Each stage depends on previous	Sequential Pipeline
Complex task decomposition	Recursive Delegation
Large batch processing	Work-Stealing Queue
Cost-sensitive analysis	Map-Reduce
Need diverse perspectives	Peer Collaboration
Zero error tolerance	MAKER

MAKER Pattern (Zero Errors)

For tasks requiring 100K+ steps with zero error tolerance (medical, financial, legal domains):

Extreme Decomposition - Recursive breakdown until each subtask <100 steps
Microagents - Single tool, focused expertise, cheap models
Multi-Agent Voting - N parallel attempts per subtask, majority consensus
Error Correction - Deterministic validation + retry with failure context

Cost comparison: Same cost as traditional approach, zero errors vs. 10+ errors.

See references/maker-pattern.md for full implementation with medical diagnosis example.
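
The voting and error-correction steps might look roughly like the sketch below. It is not the implementation from the reference file: `microagent` is a hypothetical callable standing in for a cheap model call, attempts run sequentially here for simplicity (a real system would run them in parallel), and the strict-majority rule is one possible reading of "majority consensus".

```python
from collections import Counter
from typing import Callable, Sequence

# microagent(subtask, failure_context, attempt_index) -> answer
Microagent = Callable[[str, Sequence[str], int], str]


def solve_with_voting(
    subtask: str,
    microagent: Microagent,
    validate: Callable[[str], bool],
    n: int = 3,
    max_rounds: int = 3,
) -> str:
    """Run n attempts per round, accept a strict majority among valid answers."""
    failure_context: list[str] = []
    for round_no in range(max_rounds):
        attempts = [microagent(subtask, tuple(failure_context), i) for i in range(n)]

        # Deterministic validation: discard malformed answers before voting.
        valid = [answer for answer in attempts if validate(answer)]
        if valid:
            answer, votes = Counter(valid).most_common(1)[0]
            if votes > n // 2:  # strict majority of all n attempts
                return answer

        # Retry with failure context so the next round can correct course.
        failure_context.append(f"round {round_no}: no majority among {attempts}")

    raise RuntimeError(f"no consensus on {subtask!r} after {max_rounds} rounds")
```

Voting only helps when attempts fail independently, so a production version would likely vary prompts or sampling between attempts.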

Tool Coordination

Mechanism	Purpose
Permission Inheritance	Children inherit a subset of the parent's permissions (cannot escalate)
Resource Locking	Acquire/release patterns for shared resources
Rate Limiting	Token bucket algorithm across all agents
Result Caching	Cache read-only, idempotent, expensive operations
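
Permission inheritance reduces to a subset check at spawn time. The sketch below assumes permissions are plain strings such as `fs.read`; that naming and the function `child_permissions` are illustrative only.

```python
def child_permissions(parent: frozenset[str], requested: set[str]) -> frozenset[str]:
    """Grant a child only permissions its parent already holds."""
    escalation = requested - parent
    if escalation:
        # Refuse the spawn rather than silently dropping permissions,
        # so escalation attempts are visible to the orchestrator.
        raise PermissionError(f"cannot escalate: {sorted(escalation)}")
    return frozenset(requested)
```

Applying the same check at every level means a grandchild can never hold more than its grandparent, whatever it requests.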

Sub-Agent as Tool Pattern: Wrap specialized agents as tools the parent can call, providing composable abstractions and natural lifecycle management.

See references/tool-coordination.md for implementations.
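
For reference, a token bucket shared by all agents can be sketched in a few lines. The class name, parameter names, and the injectable `clock` (used to make the behavior testable) are assumptions, not part of the referenced implementation.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """One bucket shared by every agent that calls the same rate-limited API."""

    def __init__(
        self,
        rate_per_s: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate_per_s <= 0 or capacity <= 0:
            raise ValueError("rate_per_s and capacity must be positive")
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so short bursts are allowed
        self._last = clock()
        self._lock = threading.Lock()  # agents may call from several threads

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False instead of blocking."""
        with self._lock:
            now = self._clock()
            # Refill in proportion to elapsed time, capped at capacity.
            elapsed = now - self._last
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate_per_s)
            self._last = now
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Returning `False` rather than sleeping leaves the retry or backoff decision with the orchestrator, which is one way to keep the tool layer free of hidden waits.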

Critical Lifecycle: Cascading Stop

"Always stop children before stopping self." This prevents orphaned agents.

1. Get all child agents
2. Stop all children in parallel
3. Stop self
4. Cancel ongoing work
5. Flush events
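
The five steps above can be sketched with `asyncio`. The `Agent` class, its `spawn` and `start` helpers, and the shared `log` list are hypothetical; "flush events" is modeled as appending pending events to that list.

```python
import asyncio
from typing import Coroutine, Optional


class Agent:
    def __init__(self, agent_id: str, log: list[str]) -> None:
        self.agent_id = agent_id
        self.children: list["Agent"] = []
        self.stopped = False
        self._stopping = False
        self._task: Optional[asyncio.Task] = None
        self._pending_events: list[str] = []
        self._log = log  # stand-in for a durable event sink

    def spawn(self) -> "Agent":
        child = Agent(f"{self.agent_id}.{len(self.children) + 1}", self._log)
        self.children.append(child)
        return child

    def start(self, work: Coroutine) -> None:
        """Begin the agent's ongoing work; must be called inside a running loop."""
        self._task = asyncio.create_task(work)

    async def stop(self) -> None:
        if self._stopping:  # make repeated stop() calls harmless
            return
        self._stopping = True

        # Steps 1-2: get all children and stop them in parallel.
        if self.children:
            await asyncio.gather(*(child.stop() for child in self.children))

        # Step 3: stop self (no new work is accepted past this point).
        self.stopped = True

        # Step 4: cancel ongoing work and wait for the cancellation to land.
        if self._task is not None and not self._task.done():
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass

        # Step 5: flush events.
        self._pending_events.append(f"{self.agent_id} stopped")
        self._log.extend(self._pending_events)
        self._pending_events.clear()
```

Since each child runs the same routine on its own children first, a single `await root.stop()` tears the tree down from the leaves upward.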

If pause/resume is unavailable, implement manual checkpointing: save the agent's state (messages, context, tool results), then restore it later.
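
A minimal sketch of manual checkpointing follows, assuming the state is JSON-serializable and stored in one file per agent. The `AgentCheckpoint` fields mirror the three items named above; the file layout and function names are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class AgentCheckpoint:
    agent_id: str
    messages: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    tool_results: list = field(default_factory=list)


def save_checkpoint(checkpoint: AgentCheckpoint, path: Path) -> None:
    """Write to a temp file first, then rename, so a crash mid-write
    does not leave a truncated checkpoint at `path`."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(asdict(checkpoint)))
    tmp.replace(path)


def load_checkpoint(path: Path) -> AgentCheckpoint:
    """Rebuild the checkpoint so the agent can resume from saved state."""
    return AgentCheckpoint(**json.loads(path.read_text()))
```

Restoring is then a matter of constructing a fresh agent and seeding it with the loaded messages, context, and tool results.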

Production Hardening

Concern	Solution
Orphan Detection	Heartbeat monitoring every 30 seconds
Cost Tracking	Hierarchical aggregation across agent tree
Session Persistence	Project-level task store for cross-session work
Checkpointing	Save after 10+ tool calls, $1.00 of cost, or 5 minutes elapsed, whichever comes first
Self-Modification Safety	Blast radius assessment, branch isolation, test-first
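
One way to read the orphan-detection row: every agent reports a heartbeat every 30 seconds, and a live agent whose parent has gone silent is treated as an orphan. The sketch below follows that reading. The `HeartbeatMonitor` class, the two-missed-beats timeout, and the use of hierarchical IDs to find the parent are all assumptions.

```python
import time
from typing import Callable

HEARTBEAT_INTERVAL_S = 30.0


class HeartbeatMonitor:
    def __init__(
        self,
        timeout_s: float = 2 * HEARTBEAT_INTERVAL_S,  # assumed: two missed beats
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen: dict[str, float] = {}

    def beat(self, agent_id: str) -> None:
        """Record that agent_id (e.g. "session.1.2") is still alive."""
        self._last_seen[agent_id] = self._clock()

    def orphans(self) -> list[str]:
        """Return live agents whose parent has stopped sending heartbeats."""
        now = self._clock()
        stale = {
            agent_id
            for agent_id, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        }
        found = []
        for agent_id in self._last_seen:
            if agent_id in stale or "." not in agent_id:
                continue  # dead agents and the root cannot be orphans
            parent_id = agent_id.rsplit(".", 1)[0]
            if parent_id in stale:
                found.append(agent_id)
        return sorted(found)
```

What to do with a detected orphan (stop it, or reattach it to a restarted parent) is a policy decision this sketch leaves open.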

See references/production-hardening.md for detailed implementations.
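
Hierarchical cost aggregation falls out of the ID scheme: each agent's cost is added to itself and to every ancestor prefix of its ID. The sketch below uses integer cents to avoid floating-point drift; the function name and the unit are assumptions.

```python
from collections import defaultdict


def aggregate_costs(own_cost_cents: dict[str, int]) -> dict[str, int]:
    """Roll each agent's own cost up into every ancestor in the tree.

    Keys are hierarchical IDs such as "session.1.2"; the result maps each
    ID to the total cost of that agent plus all of its descendants.
    """
    totals: defaultdict[str, int] = defaultdict(int)
    for agent_id, cost in own_cost_cents.items():
        parts = agent_id.split(".")
        for depth in range(1, len(parts) + 1):
            totals[".".join(parts[:depth])] += cost
    return dict(totals)
```

For example, with own costs of 10 cents for `session`, 25 for `session.1`, and 5 for `session.1.2`, the subtree total for `session.1` is 30 and the session total is 40.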

Real-World Example: Code Review System

A pull request orchestrator using Fan-Out/Fan-In:

Spawns four specialist reviewers in parallel (security, performance, style, tests)
Security and tests use smart models (Sonnet); style and performance use fast models (Haiku)
Each reviewer has a 2-minute timeout
Results are aggregated even if some reviewers fail
Costs are tracked per reviewer
All agents stop cleanly via cascading stop after completion
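
A compact sketch of this orchestrator using `asyncio` is shown below. The `review_fn` callable is a hypothetical stand-in for the actual model call and is assumed to return `(findings, cost_usd)`. The model names are plain labels rather than real model identifiers.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

# Reviewer -> model tier, following the split described above.
REVIEWERS = {
    "security": "sonnet",
    "tests": "sonnet",
    "style": "haiku",
    "performance": "haiku",
}

ReviewFn = Callable[[str, str, str], Awaitable[tuple[list[str], float]]]


@dataclass
class ReviewResult:
    reviewer: str
    model: str
    ok: bool
    findings: list[str]
    cost_usd: float


async def run_reviewer(
    name: str, model: str, diff: str, review_fn: ReviewFn, timeout_s: float
) -> ReviewResult:
    """Run one reviewer; convert timeouts and errors into a failed result."""
    try:
        # wait_for cancels the reviewer if it exceeds the timeout.
        findings, cost = await asyncio.wait_for(
            review_fn(name, model, diff), timeout=timeout_s
        )
        return ReviewResult(name, model, True, findings, cost)
    except asyncio.TimeoutError:
        return ReviewResult(name, model, False, ["timed out"], 0.0)
    except Exception as exc:
        return ReviewResult(name, model, False, [f"failed: {exc}"], 0.0)


async def review_pull_request(
    diff: str, review_fn: ReviewFn, timeout_s: float = 120.0
) -> dict[str, ReviewResult]:
    """Fan out to all reviewers in parallel, then fan in whatever came back."""
    results = await asyncio.gather(
        *(run_reviewer(n, m, diff, review_fn, timeout_s) for n, m in REVIEWERS.items())
    )
    return {result.reviewer: result for result in results}


def total_cost(results: dict[str, ReviewResult]) -> float:
    return sum(result.cost_usd for result in results.values())
```

Because each reviewer catches its own failures, one slow or crashing reviewer cannot prevent the others' findings from being reported. Note that a reviewer that times out reports zero cost here, which understates real spend if the model call was already billed.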

Execution Checklist

When guiding implementation of multi-agent systems:

1. Ask discovery questions - Understand requirements before architecting
2. Assess error tolerance - Zero errors → MAKER; some acceptable → simpler patterns
3. Establish four-layer architecture - Reasoning, orchestration, tool bus, adapters
4. Design schema-first tools - Typed contracts before implementation
5. Define deterministic boundary - No LLM in Layers 3-4
6. Choose orchestration model - YOLO, Safety-First, or Hybrid
7. Select coordination pattern - Fan-out, pipeline, delegation, queue, map-reduce, peer, or MAKER
8. Design tool coordination - Permission inheritance, locking, rate limiting
9. Implement cascading cleanup - Always stop children before parent
10. Add monitoring and cost tracking - Hierarchical aggregation across agent tree
11. Consider self-modification safety - If agents can modify code, add safety protocol

Common Pitfalls

Pitfall	Impact
Missing four-layer architecture	Untestable, unsafe, hard to debug
LLM calls in tools (Layer 3-4)	Non-deterministic, can't unit test
No schema-first tool design	Sub-agents can't discover tools
Missing cascading stop	Orphaned agents consuming resources
No permission inheritance	Sub-agents can escalate privileges
No timeouts	Indefinite hangs waiting for sub-agents
Unbounded concurrency	Resource exhaustion from too many agents
Ignoring cost tracking	Budget surprises
No partial-failure handling	One failure cascades to all agents
Unpersisted state	Unrecoverable workflows on crash
Uncoordinated tool access	Race conditions on shared resources
Wrong model selection	Cost inefficiency (Sonnet for simple tasks)
Self-modification without safety	Sub-agents break themselves
No heartbeat monitoring	Can't detect orphans after parent crash

Reference Files

Detailed implementations with code examples:

File	Contents
`references/four-layer-architecture.md`	Four-layer stack, deterministic boundary, schema-first tools
`references/coordination-patterns.md`	Seven coordination patterns with code
`references/maker-pattern.md`	MAKER implementation, voting, medical diagnosis example
`references/tool-coordination.md`	Permission inheritance, locking, rate limiting, caching
`references/production-hardening.md`	Cascading stop, orphan detection, cost tracking, checkpointing
