
quality-gates

Implement quality gates, user approval, iteration loops, and test-driven development. Use when validating with users, implementing feedback loops, classifying issue severity, running test-driven loops, or building multi-iteration workflows. Trigger keywords - "approval", "user validation", "iteration", "feedback loop", "severity", "test-driven", "TDD", "quality gate", "consensus".

$ Install

git clone https://github.com/MadAppGang/claude-code /tmp/claude-code && cp -r /tmp/claude-code/plugins/orchestration/skills/quality-gates ~/.claude/skills/quality-gates

// tip: Run this command in your terminal to install the skill


name: quality-gates
description: Implement quality gates, user approval, iteration loops, and test-driven development. Use when validating with users, implementing feedback loops, classifying issue severity, running test-driven loops, or building multi-iteration workflows. Trigger keywords - "approval", "user validation", "iteration", "feedback loop", "severity", "test-driven", "TDD", "quality gate", "consensus".
version: 0.1.0
tags: [orchestration, quality-gates, approval, iteration, feedback, severity, test-driven, TDD]
keywords: [approval, validation, iteration, feedback-loop, severity, test-driven, TDD, quality-gate, consensus, user-approval]

Quality Gates

Version: 1.0.0
Purpose: Patterns for approval gates, iteration loops, and quality validation in multi-agent workflows
Status: Production Ready

Overview

Quality gates are checkpoints in workflows where execution pauses for validation before proceeding. They prevent low-quality work from advancing through the pipeline and ensure user expectations are met.

This skill provides battle-tested patterns for:

  • User approval gates (cost gates, quality gates, final acceptance)
  • Iteration loops (automated refinement until quality threshold met)
  • Issue severity classification (CRITICAL, HIGH, MEDIUM, LOW)
  • Multi-reviewer consensus (unanimous vs majority agreement)
  • Feedback loops (user reports issues → agent fixes → user validates)
  • Test-driven development loops (write tests → run → analyze failures → fix → repeat)

Quality gates transform "fire-and-forget" workflows into iterative refinement systems that consistently produce high-quality results.

Core Patterns

Pattern 1: User Approval Gates

When to Ask for Approval:

Use approval gates for:

  • Cost gates: Before expensive operations (multi-model review, large-scale refactoring)
  • Quality gates: Before proceeding to next phase (design validation before implementation)
  • Final validation: Before completing workflow (user acceptance testing)
  • Irreversible operations: Before destructive actions (delete files, database migrations)

How to Present Approval:

Good Approval Prompt:

"You selected 5 AI models for code review:
 - Claude Sonnet (embedded, free)
 - Grok Code Fast (external, $0.002)
 - Gemini 2.5 Flash (external, $0.001)
 - GPT-5 Codex (external, $0.004)
 - DeepSeek Coder (external, $0.001)

 Estimated total cost: $0.008 ($0.005 - $0.010)
 Expected duration: ~5 minutes

 Proceed with multi-model review? (Yes/No/Cancel)"

Why it works:
✓ Clear context (what will happen)
✓ Cost transparency (range, not single number)
✓ Time expectation (5 minutes)
✓ Multiple options (Yes/No/Cancel)

Anti-Pattern: Vague Approval

โŒ Wrong:

"This will cost money. Proceed? (Yes/No)"

Why it fails:
✗ No cost details (how much?)
✗ No context (what will happen?)
✗ No alternatives (what if user says no?)

Handling User Responses (see the sketch after this outline):

User says YES:
  → Proceed with workflow
  → Track approval in logs
  → Continue to next step

User says NO:
  → Offer alternatives:
    1. Use fewer models (reduce cost)
    2. Use only free embedded Claude
    3. Skip this step entirely
    4. Cancel workflow
  → Ask user to choose alternative
  → Proceed based on choice

User says CANCEL:
  → Gracefully exit workflow
  → Save partial results (if any)
  → Log cancellation reason
  → Clean up temporary files
  → Notify user: "Workflow cancelled. Partial results saved to..."
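
A minimal sketch of this three-way handling; askUser, offerAlternatives, savePartialResults, and cleanupTempFiles are hypothetical helpers:

  const response = await askUser("Proceed with multi-model review? (Yes/No/Cancel)");

  if (response === "Yes") {
    log("User approved");              // track approval in logs
    await proceed();                   // continue to next step
  } else if (response === "No") {
    await offerAlternatives([          // let the user pick a cheaper path
      "Use fewer models (reduce cost)",
      "Use only free embedded Claude",
      "Skip this step entirely",
      "Cancel workflow",
    ]);
  } else {                             // Cancel: graceful exit
    await savePartialResults();        // keep whatever already exists
    log("Workflow cancelled by user"); // log cancellation reason
    await cleanupTempFiles();          // clean up temporary files
  }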

Approval Bypasses (Advanced):

For automated workflows, allow approval bypass:

Automated Workflow Mode:

If workflow is triggered by CI/CD or scheduled task:
  → Skip user approval gates
  → Use predefined defaults (e.g., max cost $0.10)
  → Log decisions for audit trail
  → Email report to stakeholders after completion

Example:
  if (isAutomatedMode) {
    if (estimatedCost <= maxAutomatedCost) {
      log(`Auto-approved: $${estimatedCost} <= $${maxAutomatedCost} threshold`);
      proceed();
    } else {
      log(`Auto-rejected: $${estimatedCost} > $${maxAutomatedCost} threshold`);
      notifyStakeholders("Cost exceeds automated threshold");
      abort();
    }
  }

Pattern 2: Iteration Loop Patterns

Max Iteration Limits:

Always set a max iteration limit to prevent infinite loops:

Typical Iteration Limits (see the sketch after this list):

Automated quality loops: 10 iterations
  - Designer validation → Developer fixes → Repeat
  - Test failures → Developer fixes → Repeat

User feedback loops: 5 rounds
  - User reports issues → Developer fixes → User validates → Repeat

Code review loops: 3 rounds
  - Reviewer finds issues → Developer fixes → Re-review → Repeat

Multi-model consensus: 1 iteration (no loop)
  - Parallel review → Consolidate → Present
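
A minimal sketch of these limits as shared configuration (the key names are illustrative):

  const MAX_ITERATIONS = {
    automatedQuality: 10,   // designer/test fix loops
    userFeedback: 5,        // user validation rounds
    codeReview: 3,          // reviewer/fix rounds
    multiModelConsensus: 1, // single pass, no loop
  };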

Exit Criteria:

Define clear exit criteria for each loop type:

Loop Type: Design Validation

Exit Criteria (checked after each iteration):
  1. Designer assessment = PASS → Exit loop (success)
  2. Iteration count >= 10 → Exit loop (max iterations)
  3. User manually approves → Exit loop (user override)
  4. No changes made by developer → Exit loop (stuck, escalate)

Example:
  for (let i = 1; i <= 10; i++) {
    const review = await designer.validate();

    if (review.assessment === "PASS") {
      log("Design validation passed on iteration " + i);
      break;  // Success exit
    }

    if (i === 10) {
      log("Max iterations reached. Escalating to user validation.");
      break;  // Max iterations exit
    }

    await developer.fix(review.issues);
  }

Progress Tracking:

Show clear progress to user during iterations:

Iteration Loop Progress:

Iteration 1/10: Designer found 5 issues → Developer fixing...
Iteration 2/10: Designer found 3 issues → Developer fixing...
Iteration 3/10: Designer found 1 issue → Developer fixing...
Iteration 4/10: Designer assessment: PASS ✓

Loop completed in 4 iterations.

Iteration History Documentation:

Track what happened in each iteration:

Iteration History (ai-docs/iteration-history.md):

## Iteration 1
Designer Assessment: NEEDS IMPROVEMENT
Issues Found:
  - Button color doesn't match design (#3B82F6 vs #2563EB)
  - Spacing between elements too tight (8px vs 16px)
  - Font size incorrect (14px vs 16px)
Developer Actions:
  - Updated button color to #2563EB
  - Increased spacing to 16px
  - Changed font size to 16px

## Iteration 2
Designer Assessment: NEEDS IMPROVEMENT
Issues Found:
  - Border radius too large (8px vs 4px)
Developer Actions:
  - Reduced border radius to 4px

## Iteration 3
Designer Assessment: PASS ✓
Issues Found: None
Result: Design validation complete

Pattern 3: Issue Severity Classification

Severity Levels:

Use a 4-level severity classification (see the sketch after this list):

CRITICAL - Must fix immediately
  - Blocks core functionality
  - Security vulnerabilities (SQL injection, XSS, auth bypass)
  - Data loss risk
  - System crashes
  - Build failures

  Action: STOP workflow, fix immediately, re-validate

HIGH - Should fix soon
  - Major bugs (incorrect behavior)
  - Performance issues (>3s page load, memory leaks)
  - Accessibility violations (keyboard navigation broken)
  - User experience blockers

  Action: Fix in current iteration, proceed after fix

MEDIUM - Should fix
  - Minor bugs (edge cases, visual glitches)
  - Code quality issues (duplication, complexity)
  - Non-blocking performance issues
  - Incomplete error handling

  Action: Fix if time permits, or schedule for next iteration

LOW - Nice to have
  - Code style inconsistencies
  - Minor refactoring opportunities
  - Documentation improvements
  - Polish and optimization

  Action: Log for future improvement, proceed without fixing
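
A minimal sketch of the classification as a lookup from severity to action (the action strings are illustrative shorthand for the levels above):

  const SEVERITY_ACTION = {
    CRITICAL: "stop-workflow-fix-revalidate",  // stop, fix immediately, re-validate
    HIGH:     "fix-in-current-iteration",
    MEDIUM:   "fix-if-time-permits-or-defer",
    LOW:      "log-for-future-proceed",
  };

  for (const issue of issues) {
    log(`${issue.severity}: ${issue.title} -> ${SEVERITY_ACTION[issue.severity]}`);
  }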

Severity-Based Prioritization:

Issue List (sorted by severity):

CRITICAL Issues (must fix all before proceeding):
  1. SQL injection in user search endpoint
  2. Missing authentication check on admin routes
  3. Password stored in plaintext

HIGH Issues (fix before code review):
  4. Memory leak in WebSocket connection
  5. Missing error handling in payment flow
  6. Accessibility: keyboard navigation broken

MEDIUM Issues (fix if time permits):
  7. Code duplication in auth controllers
  8. Inconsistent error messages
  9. Missing JSDoc comments

LOW Issues (defer to future):
  10. Variable naming inconsistency
  11. Redundant type annotations
  12. CSS could use more specificity

Action Plan:
  - Fix CRITICAL (1-3) immediately → Re-run tests
  - Fix HIGH (4-6) before code review
  - Log MEDIUM (7-9) for next iteration
  - Ignore LOW (10-12) for now

Severity Escalation:

Issues can escalate in severity based on context (a sketch follows the rule below):

Context-Based Escalation:

Issue: "Missing error handling in payment flow"
  Base Severity: MEDIUM (code quality issue)

  Context 1: Development environment
    → Severity: MEDIUM (not user-facing yet)

  Context 2: Production environment
    → Severity: HIGH (affects real users, money involved)

  Context 3: Production + recent payment failures
    → Severity: CRITICAL (actively causing issues)

Rule: Escalate severity when:
  - Issue affects production users
  - Issue involves money/security/data
  - Issue is currently causing failures
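
A minimal sketch of one reading of this rule, matching the payment-flow example above: active failures go straight to CRITICAL, and production exposure of users, money, security, or data bumps severity one level (the issue and context fields are hypothetical):

  const ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];
  const raise = (s) => ORDER[Math.min(ORDER.indexOf(s) + 1, ORDER.length - 1)];

  function escalate(issue, context) {
    if (context.activeFailures) return "CRITICAL";       // actively causing issues
    if (context.environment === "production" &&
        (issue.affectsUsers || issue.involvesMoney ||
         issue.involvesSecurity || issue.involvesData)) {
      return raise(issue.baseSeverity);                  // real users + sensitive assets
    }
    return issue.baseSeverity;                           // e.g., development stays MEDIUM
  }

With the payment-flow example: base MEDIUM stays MEDIUM in development, becomes HIGH in production, and CRITICAL once failures are active.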

Pattern 4: Multi-Reviewer Consensus

Consensus Levels:

When multiple reviewers evaluate the same code/design (a mapping sketch follows the four levels):

UNANIMOUS (100% agreement):
  - ALL reviewers flagged this issue
  - VERY HIGH confidence
  - Highest priority (likely a real problem)

Example:
  3/3 reviewers: "SQL injection in search endpoint"
  → UNANIMOUS consensus
  → CRITICAL priority (all agree it's critical)

STRONG CONSENSUS (67-99% agreement):
  - MOST reviewers flagged this issue
  - HIGH confidence
  - High priority (probably a real problem)

Example:
  2/3 reviewers: "Missing input validation"
  → STRONG consensus (67%)
  → HIGH priority

MAJORITY (50-66% agreement):
  - HALF or more flagged this issue
  - MEDIUM confidence
  - Medium priority (worth investigating)

Example:
  2/4 reviewers: "Code duplication in controllers"
  → MAJORITY consensus (50%)
  → MEDIUM priority

DIVERGENT (< 50% agreement):
  - FEW reviewers flagged this issue
  - LOW confidence
  - Low priority (may be model-specific or false positive)

Example:
  1/3 reviewers: "Variable naming could be better"
  → DIVERGENT (33%)
  → LOW priority (one reviewer's opinion)
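
A minimal sketch mapping an agreement ratio to these levels:

  function consensusLevel(flaggedBy, totalReviewers) {
    const ratio = flaggedBy / totalReviewers;
    if (ratio === 1)     return "UNANIMOUS";  // 100% agreement
    if (ratio >= 2 / 3)  return "STRONG";     // 67-99%
    if (ratio >= 1 / 2)  return "MAJORITY";   // 50-66%
    return "DIVERGENT";                       // < 50%
  }

  consensusLevel(3, 3);  // "UNANIMOUS"
  consensusLevel(2, 3);  // "STRONG" (67%)
  consensusLevel(2, 4);  // "MAJORITY" (50%)
  consensusLevel(1, 3);  // "DIVERGENT" (33%)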

Consensus-Based Prioritization:

Prioritized Issue List (by consensus + severity; a sorting sketch follows the list):

1. [UNANIMOUS - CRITICAL] SQL injection in search
   ALL reviewers agree: Claude, Grok, Gemini (3/3)

2. [UNANIMOUS - HIGH] Missing input validation
   ALL reviewers agree: Claude, Grok, Gemini (3/3)

3. [STRONG - HIGH] Memory leak in WebSocket
   MOST reviewers agree: Claude, Grok (2/3)

4. [STRONG - MEDIUM] Code duplication
   MOST reviewers agree: Claude, Gemini (2/3)

5. [DIVERGENT - LOW] Variable naming
   SINGLE reviewer: Claude only (1/3)

Action:
  - Fix issues 1-2 immediately (unanimous + CRITICAL/HIGH)
  - Fix issue 3 before review (strong consensus)
  - Consider issue 4 (strong consensus, but medium severity)
  - Ignore issue 5 (divergent, likely false positive)
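
A minimal sketch of the combined ordering, sorting by consensus first and severity second (lower rank = fix first):

  const CONSENSUS_RANK = { UNANIMOUS: 0, STRONG: 1, MAJORITY: 2, DIVERGENT: 3 };
  const SEVERITY_RANK  = { CRITICAL: 0, HIGH: 1, MEDIUM: 2, LOW: 3 };

  issues.sort((a, b) =>
    (CONSENSUS_RANK[a.consensus] - CONSENSUS_RANK[b.consensus]) ||
    (SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity])
  );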

Pattern 5: Feedback Loop Implementation

User Feedback Loop (see the loop sketch after Step 8):

Workflow: User Validation with Feedback

Step 1: Initial Implementation
  Developer implements feature
  Designer/Tester validates
  Present to user for manual validation

Step 2: User Validation Gate (MANDATORY)
  Present to user:
    "Implementation complete. Please manually verify:
     - Open app at http://localhost:3000
     - Test feature: [specific instructions]
     - Compare to design reference

     Does it meet expectations? (Yes/No)"

Step 3a: User says YES
  → ✅ Feature approved
  → Generate final report
  → Mark workflow complete

Step 3b: User says NO
  → Collect specific feedback

Step 4: Collect Specific Feedback
  Ask user: "Please describe the issues you found:"

  User response:
    "1. Button color is wrong (should be blue, not green)
     2. Spacing is too tight between elements
     3. Font size is too small"

Step 5: Extract Structured Feedback
  Parse user feedback into structured issues:

  Issue 1:
    Component: Button
    Problem: Color incorrect
    Expected: Blue (#2563EB)
    Actual: Green (#10B981)
    Severity: MEDIUM

  Issue 2:
    Component: Container
    Problem: Spacing too tight
    Expected: 16px
    Actual: 8px
    Severity: MEDIUM

  Issue 3:
    Component: Text
    Problem: Font size too small
    Expected: 16px
    Actual: 14px
    Severity: LOW

Step 6: Launch Fixing Agent
  Task: ui-developer
    Prompt: "Fix user-reported issues:

             1. Button color: Change from #10B981 to #2563EB
             2. Container spacing: Increase from 8px to 16px
             3. Text font size: Increase from 14px to 16px

             User feedback: [user's exact words]"

Step 7: Re-validate
  After fixes:
    - Re-run designer validation
    - Loop back to Step 2 (user validation)

Step 8: Max Feedback Rounds
  Limit: 5 feedback rounds (prevent infinite loop)

  If round > 5:
    Escalate to human review
    "Unable to meet user expectations after 5 rounds.
     Manual intervention required."
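
A minimal sketch of the round loop tying Steps 2-8 together; askUser, collectFeedback, applyFixes, revalidate, and escalateToHuman are hypothetical helpers:

  const MAX_ROUNDS = 5;

  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const answer = await askUser("Does it meet expectations? (Yes/No)");  // Step 2
    if (answer === "Yes") {
      log(`Approved after ${round} round(s)`);    // Step 3a: success exit
      break;
    }
    if (round === MAX_ROUNDS) {
      escalateToHuman("Unable to meet user expectations after 5 rounds."); // Step 8
      break;
    }
    const issues = await collectFeedback();       // Steps 4-5: structured issues
    await applyFixes(issues);                     // Step 6: fixing agent
    await revalidate();                           // Step 7: designer re-check
  }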

Feedback Round Tracking:

Feedback Round History:

Round 1:
  User Issues: Button color, spacing, font size
  Fixes Applied: Updated all 3 issues
  Result: Re-validate

Round 2:
  User Issues: Border radius too large
  Fixes Applied: Reduced border radius
  Result: Re-validate

Round 3:
  User Issues: None
  Result: ✅ APPROVED

Total Rounds: 3/5

Pattern 6: Test-Driven Development Loop

When to Use:

Use TDD loop after implementing code, before code review:

Workflow Phases:

Phase 1: Architecture Planning
Phase 2: Implementation
Phase 2.5: Test-Driven Development Loop ← THIS PATTERN
Phase 3: Code Review
Phase 4: User Acceptance

The TDD Loop Pattern (see the control-flow sketch after Step 6):

Step 1: Write Tests First
  Task: test-architect
    Prompt: "Write comprehensive tests for authentication feature.
             Requirements: [link to requirements]
             Implementation: [link to code]"
    Output: tests/auth.test.ts

Step 2: Run Tests
  Bash: bun test tests/auth.test.ts
  Capture output and exit code

Step 3: Check Test Results
  If all tests pass:
    → ✅ TDD loop complete
    → Proceed to code review (Phase 3)

  If tests fail:
    → Analyze failure (continue to Step 4)

Step 4: Analyze Test Failure
  Task: test-architect
    Prompt: "Analyze test failure output:

             [test failure logs]

             Determine root cause:
             - TEST_ISSUE: Test has bug (bad assertion, missing mock, wrong expectation)
             - IMPLEMENTATION_ISSUE: Code has bug (logic error, missing validation, incorrect behavior)

             Provide detailed analysis."

  test-architect returns:
    verdict: TEST_ISSUE | IMPLEMENTATION_ISSUE
    analysis: Detailed explanation
    recommendation: Specific fix needed

Step 5a: If TEST_ISSUE (test is wrong)
  Task: test-architect
    Prompt: "Fix test based on analysis:
             [analysis from Step 4]"

  After fix:
    → Re-run tests (back to Step 2)
    → Loop continues

Step 5b: If IMPLEMENTATION_ISSUE (code is wrong)
  Provide structured feedback to developer:

  Task: backend-developer
    Prompt: "Fix implementation based on test failure:

             Test Failure:
             [failure output]

             Root Cause:
             [analysis from test-architect]

             Recommended Fix:
             [specific fix needed]"

  After fix:
    → Re-run tests (back to Step 2)
    → Loop continues

Step 6: Max Iteration Limit
  Limit: 10 iterations

  Iteration tracking:
    Iteration 1/10: 5 tests failed → Fix implementation
    Iteration 2/10: 2 tests failed → Fix test (bad mock)
    Iteration 3/10: All tests pass ✅

  If iteration > 10:
    Escalate to human review
    "Unable to pass all tests after 10 iterations.
     Manual debugging required."
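
A minimal sketch of the loop control; runTests, analyzeFailure, fixTests, fixImplementation, and escalateToHuman are hypothetical helpers wrapping the agent tasks above:

  const MAX_ITERATIONS = 10;

  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const result = await runTests();                // Step 2: bun test ...
    if (result.allPassed) {
      log(`TDD loop complete on iteration ${i}`);   // Step 3: success exit
      break;
    }
    if (i === MAX_ITERATIONS) {
      escalateToHuman("Unable to pass all tests after 10 iterations.");  // Step 6
      break;
    }
    const verdict = await analyzeFailure(result.output);  // Step 4: test-architect
    if (verdict.type === "TEST_ISSUE") {
      await fixTests(verdict.analysis);             // Step 5a: repair the test
    } else {
      await fixImplementation(verdict.analysis);    // Step 5b: repair the code
    }
  }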

Example TDD Loop:

Phase 2.5: Test-Driven Development Loop

Iteration 1:
  Tests Run: 20 tests
  Results: 5 failed, 15 passed
  Failure: "JWT token validation fails with expired token"
  Analysis: IMPLEMENTATION_ISSUE - Missing expiration check
  Fix: Added expiration validation in TokenService
  Re-run: Continue to Iteration 2

Iteration 2:
  Tests Run: 20 tests
  Results: 2 failed, 18 passed
  Failure: "Mock database not reset between tests"
  Analysis: TEST_ISSUE - Missing beforeEach cleanup
  Fix: Added database reset in test setup
  Re-run: Continue to Iteration 3

Iteration 3:
  Tests Run: 20 tests
  Results: All passed ✅
  Result: TDD loop complete, proceed to code review

Total Iterations: 3/10
Duration: ~5 minutes
Benefits:
  - Caught 2 bugs before code review
  - Fixed 1 test quality issue
  - All tests passing gives confidence in implementation

Benefits of TDD Loop:

Benefits:

1. Catch bugs early (before code review, not after)
2. Ensure test quality (test-architect fixes bad tests)
3. Automated quality assurance (no manual testing needed)
4. Fast feedback loop (seconds to run tests, not minutes)
5. Confidence in implementation (all tests passing)

Performance:
  Traditional: Implement → Review → Find bugs → Fix → Re-review
  Time: 30+ minutes, multiple review rounds

  TDD Loop: Implement → Test → Fix → Test → Review (with confidence)
  Time: 15 minutes, single review round (fewer issues)

Integration with Other Skills

quality-gates + multi-model-validation:

Use Case: Cost approval before multi-model review

Step 1: Estimate costs (multi-model-validation)
Step 2: User approval gate (quality-gates)
  If approved: Proceed with parallel execution
  If rejected: Offer alternatives
Step 3: Execute review (multi-model-validation)

quality-gates + multi-agent-coordination:

Use Case: Iteration loop with designer validation

Step 1: Agent selection (multi-agent-coordination)
  Select designer + ui-developer

Step 2: Iteration loop (quality-gates)
  For i = 1 to 10:
    - Run designer validation
    - If PASS: Exit loop
    - Else: Delegate to ui-developer for fixes

Step 3: User validation gate (quality-gates)
  Mandatory manual approval

quality-gates + error-recovery:

Use Case: Test-driven loop with error recovery

Step 1: Run tests (quality-gates TDD pattern)
Step 2: If test execution fails (error-recovery)
  - Syntax error → Fix and retry
  - Framework crash → Notify user, skip TDD
Step 3: If tests pass (quality-gates)
  - Proceed to code review

Best Practices

Do:

  • ✅ Set max iteration limits (prevent infinite loops)
  • ✅ Define clear exit criteria (PASS, max iterations, user override)
  • ✅ Track iteration history (document what happened)
  • ✅ Show progress to user ("Iteration 3/10 complete")
  • ✅ Classify issue severity (CRITICAL → HIGH → MEDIUM → LOW)
  • ✅ Prioritize by consensus + severity
  • ✅ Ask user approval for expensive operations
  • ✅ Collect specific feedback (not vague complaints)
  • ✅ Use TDD loop to catch bugs early

Don't:

  • โŒ Create infinite loops (no exit criteria)
  • โŒ Skip user validation gates (mandatory for UX)
  • โŒ Ignore consensus (unanimous issues are real)
  • โŒ Batch all severities together (prioritize CRITICAL)
  • โŒ Proceed without approval for >$0.01 operations
  • โŒ Collect vague feedback ("it's wrong" โ†’ what specifically?)
  • โŒ Skip TDD loop (catches bugs before expensive review)

Performance:

  • Iteration loops: 5-10 iterations typical, max 10-15 min
  • TDD loop: 3-5 iterations typical, max 5-10 min
  • User feedback: 1-3 rounds typical, max 5 rounds

Examples

Example 1: User Approval Gate for Multi-Model Review

Scenario: User requests multi-model review, costs $0.008

Execution:

Step 1: Estimate Costs
  Input: 450 lines × 1.5 = 675 tokens per model
  Output: 2000-4000 tokens per model
  Total: 3 models × 3000 avg = 9000 output tokens
  Cost: ~$0.008 ($0.005 - $0.010)

Step 2: Present Approval Gate
  "Multi-model review will analyze 450 lines with 3 AI models:
   - Claude Sonnet (embedded, free)
   - Grok Code Fast (external, $0.002)
   - Gemini 2.5 Flash (external, $0.001)

   Estimated cost: $0.008 ($0.005 - $0.010)
   Duration: ~5 minutes

   Proceed? (Yes/No/Cancel)"

Step 3a: User says YES
  → Proceed with parallel execution
  → Track approval: log("User approved $0.008 cost")

Step 3b: User says NO
  → Offer alternatives:
    1. Use only free Claude (no external models)
    2. Use only 1 external model (reduce cost to $0.002)
    3. Skip review entirely
  → Ask user to choose

Step 3c: User says CANCEL
  → Exit gracefully
  → Log: "User cancelled multi-model review"
  → Clean up temporary files

Example 2: Designer Validation Iteration Loop

Scenario: UI implementation with automated iteration until PASS

Execution:

Iteration 1:
  Task: designer
    Prompt: "Validate navbar against Figma design"
    Output: ai-docs/design-review-1.md
    Assessment: NEEDS IMPROVEMENT
    Issues:
      - Button color: #3B82F6 (expected #2563EB)
      - Spacing: 8px (expected 16px)

  Task: ui-developer
    Prompt: "Fix issues from ai-docs/design-review-1.md"
    Changes: Updated button color, increased spacing

  Result: Continue to Iteration 2

Iteration 2:
  Task: designer
    Prompt: "Re-validate navbar"
    Output: ai-docs/design-review-2.md
    Assessment: NEEDS IMPROVEMENT
    Issues:
      - Border radius: 8px (expected 4px)

  Task: ui-developer
    Prompt: "Fix border radius issue"
    Changes: Reduced border radius to 4px

  Result: Continue to Iteration 3

Iteration 3:
  Task: designer
    Prompt: "Re-validate navbar"
    Output: ai-docs/design-review-3.md
    Assessment: PASS ✓
    Issues: None

  Result: Exit loop (success)

Summary:
  Total Iterations: 3/10
  Duration: ~8 minutes
  Automated Fixes: 3 issues resolved
  Result: PASS, proceed to user validation

Example 3: Test-Driven Development Loop

Scenario: Authentication implementation with TDD

Execution:

Phase 2.5: Test-Driven Development Loop

Iteration 1:
  Task: test-architect
    Prompt: "Write tests for authentication feature"
    Output: tests/auth.test.ts (20 tests)

  Bash: bun test tests/auth.test.ts
    Result: 5 failed, 15 passed

  Task: test-architect
    Prompt: "Analyze test failures"
    Verdict: IMPLEMENTATION_ISSUE
    Analysis: "Missing JWT expiration validation"

  Task: backend-developer
    Prompt: "Add JWT expiration validation"
    Changes: Updated TokenService.verify()

  Bash: bun test tests/auth.test.ts
    Result: Continue to Iteration 2

Iteration 2:
  Bash: bun test tests/auth.test.ts
    Result: 2 failed, 18 passed

  Task: test-architect
    Prompt: "Analyze test failures"
    Verdict: TEST_ISSUE
    Analysis: "Mock database not reset between tests"

  Task: test-architect
    Prompt: "Fix test setup"
    Changes: Added beforeEach cleanup

  Bash: bun test tests/auth.test.ts
    Result: Continue to Iteration 3

Iteration 3:
  Bash: bun test tests/auth.test.ts
    Result: All 20 passed ✅

  Result: TDD loop complete, proceed to code review

Summary:
  Total Iterations: 3/10
  Duration: ~5 minutes
  Bugs Caught: 1 implementation bug, 1 test bug
  Result: All tests passing, high confidence in code

Troubleshooting

Problem: Infinite iteration loop

Cause: No exit criteria or max iteration limit

Solution: Always set max iterations (10 for automated, 5 for user feedback)

โŒ Wrong:
  while (true) {
    if (review.assessment === "PASS") break;
    fix();
  }

✅ Correct:
  for (let i = 1; i <= 10; i++) {
    const review = await validate();            // re-validate each iteration
    if (review.assessment === "PASS") break;    // success exit
    if (i === 10) { escalateToUser(); break; }  // max-iterations exit
    fix();
  }

Problem: User approval skipped for expensive operation

Cause: Missing approval gate

Solution: Always ask approval for costs >$0.01

โŒ Wrong:
  if (userRequestedMultiModel) {
    executeReview();
  }

✅ Correct:
  if (userRequestedMultiModel) {
    const cost = estimateCost();
    if (cost > 0.01) {
      const approved = await askUserApproval(cost);
      if (!approved) return offerAlternatives();
    }
    executeReview();
  }

Problem: All issues treated equally

Cause: No severity classification

Solution: Classify by severity, prioritize CRITICAL

โŒ Wrong:
  issues.forEach(issue => fix(issue));

✅ Correct:
  const critical = issues.filter(i => i.severity === "CRITICAL");
  const high = issues.filter(i => i.severity === "HIGH");

  critical.forEach(issue => fix(issue));  // Fix critical first
  high.forEach(issue => fix(issue));      // Then high
  // MEDIUM and LOW deferred or skipped

Summary

Quality gates ensure high-quality results through:

  • User approval gates (cost, quality, final validation)
  • Iteration loops (automated refinement, max 10 iterations)
  • Severity classification (CRITICAL → HIGH → MEDIUM → LOW)
  • Consensus prioritization (unanimous → strong → majority → divergent)
  • Feedback loops (collect specific issues, fix, re-validate)
  • Test-driven development (write tests, run, fix, repeat until pass)

Master these patterns and your workflows will consistently produce high-quality, validated results.


Extracted From:

  • /review command (user approval for costs, consensus analysis)
  • /validate-ui command (iteration loops, user validation gates, feedback collection)
  • /implement command (PHASE 2.5 test-driven development loop)
  • Multi-model review patterns (consensus-based prioritization)