Unnamed Skill

Use when designing/modifying system architecture or evaluating technology choices. Enforces 7-section TodoWrite with 22+ items. Triggers: "design architecture", "system design", "architectural decision", "should we use [tech]", "compare [A] vs [B]", "add new service", "microservices", "database choice", "API design", "scale to [X] users", "infrastructure decision". If thinking ANY of these, USE THIS SKILL: "quick recommendation is fine", "obvious choice", "we already know the answer", "just need to pick one", "simple architecture question".

$ Instalar

git clone https://github.com/pvillega/claude-templates /tmp/claude-templates && cp -r /tmp/claude-templates/.claude/skills/architecture-discipline ~/.claude/skills/claude-templates

// tip: Run this command in your terminal to install the skill


name: architecture-discipline description: Use when designing/modifying system architecture or evaluating technology choices. Enforces 7-section TodoWrite with 22+ items. Triggers: "design architecture", "system design", "architectural decision", "should we use [tech]", "compare [A] vs [B]", "add new service", "microservices", "database choice", "API design", "scale to [X] users", "infrastructure decision". If thinking ANY of these, USE THIS SKILL: "quick recommendation is fine", "obvious choice", "we already know the answer", "just need to pick one", "simple architecture question".

Architecture Discipline

When to Use

Use when decisions affect:

  • Data models or schema changes
  • Service boundaries or new services
  • Deployment topology or infrastructure
  • Scale characteristics (10x growth implications)
  • Technology stack choices
  • API contracts or external integrations

Skip for:

  • Parameter tweaks or configuration changes
  • UI/styling changes
  • Localization or copy changes
  • Bug fixes within existing architecture
  • Adding fields to existing models (unless schema migration)

Threshold: If the decision could cause a 3+ month re-architecture project if wrong, use this skill.

CRITICAL: This Is Reasoning Discipline, Not a Checklist

The 7 sections are a reasoning sequence, not boxes to check:

  • Alternatives BEFORE choosing
  • Scale requirements BEFORE design
  • Failure modes built in, not bolted on

🚨 If you wrote architecture without starting with all 7 sections: DELETE and restart. Retrofitting analysis is rationalization, not evaluation.


MANDATORY FIRST STEP

CREATE TodoWrite with these 7 sections (22+ items total):

SectionMinimum Items
Scale Analysis4+
Architectural Options3+
Ripple Effect Analysis5+
Failure Modes3+
Observability3+
Documentation2+
Migration/Compatibility2+

Do not design, propose solutions, or implement until TodoWrite is verified.


Verification Checkpoint

After creating TodoWrite, verify 3 random items pass this test:

Each item must have ALL THREE:

  • ✓ Concrete numbers/thresholds ("100K users", "$500/mo", "P95 < 500ms")
  • ✓ Specific tools/technologies ("PostgreSQL", "Redis", "CloudWatch")
  • ✓ Measurable outcome ("handles 1M req/sec", "costs $X at 10x")
❌ FAILS✅ PASSES
"Add monitoring""CloudWatch: websocket.connections.active, alert if >5% error rate via PagerDuty"
"Evaluate caching""Compare: Redis (1ms, $300/mo) vs In-memory LRU (0.1ms, $0) vs No cache (100ms)"
"Analyze scale""Current: 100K DAU, 50 req/sec. 10x: 1M users, 500 req/sec. Bottleneck: PostgreSQL connection pool"

DO NOT PROCEED until 22+ items AND quality check passes.


Section Requirements

1. Scale Analysis (4+ items)

NEVER design for current scale only. Before proposing any solution:

  • Current scale: Users (DAU/MAU), requests/sec, data volume, read/write ratio
  • 10x scale: What numbers at 10x? When expected?
  • Bottlenecks: What breaks at 10x? (DB connections, API limits, memory)
  • Mitigation: Specific solution for each bottleneck

2. Architectural Options (3+ items)

NEVER present single solution. Minimum 3 distinct options, each with:

  • Performance: Latency (P50/P95/P99), throughput, scale limit
  • Complexity: LOC estimate, services involved, operational burden
  • Cost: Infrastructure ($X/mo current, $Y/mo at 10x), development (engineer-weeks)
  • Trade-offs: Specific advantages (✅) and disadvantages (❌)

If stakeholder suggests solution: Add as Option A, evaluate with SAME rigor as alternatives.

3. Ripple Effect Analysis (5+ items)

Changes propagate across layers. Analyze ALL:

  • Data layer: Schema changes, migrations, indexes, query performance
  • Services: Which need updates? API contracts changed?
  • API: Breaking changes? Version bump? Backward compatibility?
  • Clients: Mobile updates? Web UI changes?
  • Operations: Deployment changes? New monitoring? Cost changes?

4. Failure Modes (3+ items)

For each mode:

  • Scenario: [Component] fails because [reason]
  • Detection: How we know (metrics drop, error rate spike)
  • Impact: What breaks (user features, data integrity)
  • Mitigation: Circuit breaker, fallback, redundancy

5. Observability (3+ items)

  • Metrics: Specific (latency P95, error rate %, throughput)
  • Alerts: Conditions (error rate > 5%, latency P95 > 500ms)
  • Dashboards: Key visualizations

6. Documentation (2+ items)

  • ADR: Chosen option, rejected alternatives, trade-offs, constraints
  • Diagram: New components, data flows, failure paths

7. Migration/Compatibility (2+ items)

  • Backward compatibility: Old clients work? API versioning?
  • Migration path: Phased rollout, feature flags, rollback procedure

Red Flags - STOP When You Think:

ThoughtReality
"Analysis paralysis"This IS the analysis that prevents expensive mistakes
"We'll add scale/alternatives/failure modes later"Retrofitting costs 5-10x more
"CTO already decided"Still needs independent evaluation
"Being pragmatic not dogmatic"These requirements ARE pragmatic
"Just a simple feature"Simple becomes complex at scale
"We already know the solution"Compare 3 alternatives first
"Keep it simple"Simple for current scale = complex re-architecture at 10x
"I can add missing sections to existing work"DELETE and restart

Override Requirements

To skip ANY requirement, you MUST provide ALL 4:

  1. Specific retrofit date (not "later")
  2. Budget allocated (engineer-weeks)
  3. Risk acceptance signed by decision maker
  4. Interim mitigation plan
SkippedRiskCost
Scale AnalysisRe-architecture in 6-12 months3-6 month project, 5-10x cost
AlternativesOptimize wrong dimension2-4 month migration
Failure ModesProduction incidents$5-50K per incident
Ripple EffectsBroken clients, data issuesDeployment failures

Verification Before Complete

CategoryRequirements
Scale✓ Current + 10x projected ✓ Bottlenecks ✓ Mitigations
Trade-offs✓ 3+ options ✓ Performance/complexity/cost ✓ Rationale
Impact✓ All layers analyzed ✓ Breaking changes identified
Failure✓ Specific modes ✓ Detection ✓ Mitigation ✓ Rollback
Documentation✓ ADR ✓ Diagram updated

If any item missing, do not proceed to implementation.