Marketplace

building-multiagent-systems

This skill should be used when designing or implementing systems with multiple AI agents that coordinate to accomplish tasks. Triggers on "multi-agent", "orchestrator", "sub-agent", "coordination", "delegation", "parallel agents", "sequential pipeline", "fan-out", "map-reduce", "spawn agents", "agent hierarchy".

$ Installer

git clone https://github.com/2389-research/claude-plugins /tmp/claude-plugins && cp -r /tmp/claude-plugins/building-multiagent-systems/skills ~/.claude/skills/claude-plugins

// tip: Run this command in your terminal to install the skill


name: building-multiagent-systems description: This skill should be used when designing or implementing systems with multiple AI agents that coordinate to accomplish tasks. Triggers on "multi-agent", "orchestrator", "sub-agent", "coordination", "delegation", "parallel agents", "sequential pipeline", "fan-out", "map-reduce", "spawn agents", "agent hierarchy".

Building Multi-Agent, Tool-Using Agentic Systems

Overview

Comprehensive architecture patterns for multi-agent systems where AI agents coordinate to accomplish complex tasks using tools. Language-agnostic and applicable across TypeScript, Python, Go, Rust, and other environments.

Discovery Questions (Required)

Before architecting any system, ask these six mandatory questions:

  1. Starting Point - Greenfield, adding to existing system, or fixing current implementation?
  2. Primary Use Case - Parallel work, sequential pipeline, recursive delegation, peer collaboration, work queues, or other?
  3. Scale Expectations - Small (2-5 agents), medium (10-50), or large (100+)?
  4. State Requirements - Stateless runs, session-based, or persistent across crashes?
  5. Tool Coordination - Independent agents, shared read-only resources, write coordination, or rate-limited APIs?
  6. Existing Constraints - Language, framework, performance needs, compliance requirements?

Foundational Architecture

Four-Layer Stack

Every agent follows the four-layer architecture for testability, safety, and modularity:

LayerNameResponsibility
1Reasoning (LLM)Plans, critiques, decides which tools to call
2OrchestrationValidates, routes, enforces policy, spawns sub-agents
3Tool BusSchema validation, tool execution coordination
4Deterministic AdaptersFile I/O, APIs, shell commands, database access

Critical Rule: Everything below Layer 1 must be deterministic. No LLM calls in tools.

See references/four-layer-architecture.md for detailed implementation with code examples.

Foundational Patterns

PatternPurpose
Event-SourcingAll state changes as events for audit trails and replay
Hierarchical IDsEncode delegation hierarchy (e.g., session.1.2) for cost aggregation
Agent State MachinesExplicit states (idle → thinking → tool_execution → stopped) with invalid transition errors
CommunicationEventEmitter for state changes, promises for result collection

Seven Coordination Patterns

Choose based on discovery question answers:

PatternUse CaseTrade-offs
Fan-Out/Fan-InParallel independent workFast but costly; watch for orphans
Sequential PipelineMulti-stage transformationsBottleneck at slowest stage
Recursive DelegationHierarchical task breakdownMust add depth limits
Work-Stealing Queue1000+ tasks with load balancingNo built-in priority
Map-ReduceCost optimizationCheap map ($0.01), smart reduce ($0.15)
Peer CollaborationLLM council for bias reductionExpensive (3N+1 calls), slow
MAKERZero-error tasks (100K+ steps)5× cost but ~0% error rate

See references/coordination-patterns.md for detailed implementations.

Pattern Selection Guide

RequirementRecommended Pattern
Parallel independent tasksFan-Out/Fan-In
Each stage depends on previousSequential Pipeline
Complex task decompositionRecursive Delegation
Large batch processingWork-Stealing Queue
Cost-sensitive analysisMap-Reduce
Need diverse perspectivesPeer Collaboration
Zero error toleranceMAKER

MAKER Pattern (Zero Errors)

For tasks requiring 100K+ steps with zero error tolerance (medical, financial, legal domains):

  1. Extreme Decomposition - Recursive breakdown until each subtask <100 steps
  2. Microagents - Single tool, focused expertise, cheap models
  3. Multi-Agent Voting - N parallel attempts per subtask, majority consensus
  4. Error Correction - Deterministic validation + retry with failure context

Cost comparison: Same cost as traditional approach, zero errors vs. 10+ errors.

See references/maker-pattern.md for full implementation with medical diagnosis example.

Tool Coordination

MechanismPurpose
Permission InheritanceChildren inherit subset of parent permissions (cannot escalate)
Resource LockingAcquire/release patterns for shared resources
Rate LimitingToken bucket algorithm across all agents
Result CachingCache read-only, idempotent, expensive operations

Sub-Agent as Tool Pattern: Wrap specialized agents as tools the parent can call, providing composable abstractions and natural lifecycle management.

See references/tool-coordination.md for implementations.

Critical Lifecycle: Cascading Stop

"Always stop children before stopping self." This prevents orphaned agents.

1. Get all child agents
2. Stop all children in parallel
3. Stop self
4. Cancel ongoing work
5. Flush events

If pause/resume unavailable, implement manual checkpointing: save agent state (messages, context, tool results), then restore later.

Production Hardening

ConcernSolution
Orphan DetectionHeartbeat monitoring every 30 seconds
Cost TrackingHierarchical aggregation across agent tree
Session PersistenceProject-level task store for cross-session work
CheckpointingSave after 10+ tools, $1.00 cost, or 5 minutes elapsed
Self-Modification SafetyBlast radius assessment, branch isolation, test-first

See references/production-hardening.md for detailed implementations.

Real-World Example: Code Review System

A pull request orchestrator using Fan-Out/Fan-In:

  1. Spawns four specialist reviewers in parallel (security, performance, style, tests)
  2. Security and tests use smart models (Sonnet); style and performance use fast models (Haiku)
  3. Each reviewer has 2-minute timeout
  4. Results aggregate regardless of partial failures
  5. Costs track per reviewer
  6. All agents stop cleanly via cascading stop after completion

Execution Checklist

When guiding implementation of multi-agent systems:

  1. Ask discovery questions - Understand requirements before architecting
  2. Assess error tolerance - Zero errors → MAKER; some acceptable → simpler patterns
  3. Establish four-layer architecture - Reasoning, orchestration, tool bus, adapters
  4. Design schema-first tools - Typed contracts before implementation
  5. Define deterministic boundary - No LLM in Layers 3-4
  6. Choose orchestration model - YOLO, Safety-First, or Hybrid
  7. Select coordination pattern - Fan-out, pipeline, delegation, queue, map-reduce, peer, or MAKER
  8. Design tool coordination - Permission inheritance, locking, rate limiting
  9. Implement cascading cleanup - Always stop children before parent
  10. Add monitoring and cost tracking - Hierarchical aggregation across agent tree
  11. Consider self-modification safety - If agents can modify code, add safety protocol

Common Pitfalls

PitfallImpact
Missing four-layer architectureUntestable, unsafe, hard to debug
LLM calls in tools (Layer 3-4)Non-deterministic, can't unit test
No schema-first tool designSub-agents can't discover tools
Missing cascading stopOrphaned agents consuming resources
No permission inheritanceSub-agents can escalate privileges
No timeoutsIndefinite hangs waiting for sub-agents
Unbounded concurrencyResource exhaustion from too many agents
Ignoring cost trackingBudget surprises
No partial-failure handlingOne failure cascades to all agents
Unpersisted stateUnrecoverable workflows on crash
Uncoordinated tool accessRace conditions on shared resources
Wrong model selectionCost inefficiency (Sonnet for simple tasks)
Self-modification without safetySub-agents break themselves
No heartbeat monitoringCan't detect orphans after parent crash

Reference Files

Detailed implementations with code examples:

FileContents
references/four-layer-architecture.mdFour-layer stack, deterministic boundary, schema-first tools
references/coordination-patterns.mdSeven coordination patterns with code
references/maker-pattern.mdMAKER implementation, voting, medical diagnosis example
references/tool-coordination.mdPermission inheritance, locking, rate limiting, caching
references/production-hardening.mdCascading stop, orphan detection, cost tracking, checkpointing