darwin-godwin-machine

Hybrid cognitive architecture combining Darwinian evolution with Gödel Machine self-improvement for maximum reasoning power. Use for: complex coding problems, multi-step reasoning, architecture design, debugging hard problems, any task requiring exhaustive solution exploration with formal verification. Activates when user needs "powerful reasoning", "best possible solution", "explore all options", or faces problems where first-attempt solutions typically fail.

$ Install

git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/design/darwin-godwin-machine ~/.claude/skills/claude-skill-registry

// tip: Run this command in your terminal to install the skill



Darwin-Gödel Machine

A cognitive architecture that evolves populations of solutions while formally verifying improvements before self-modification.

Core Philosophy

Darwin: Generate diverse solution populations → Apply selection pressure → Evolve toward optimum

Gödel: Verify improvements formally before accepting → Enable recursive self-improvement → Prove modifications beneficial

Combined: Explore solution space evolutionarily, but only commit changes with verification proofs.


THE EXECUTION LOOP

Every problem runs this loop. No exceptions. Depth scales with complexity.

PHASE 1: DECOMPOSE
├─ Parse the problem into atomic sub-problems
├─ Identify constraints, success criteria, edge cases
├─ Define fitness function: What makes a solution "better"?
└─ Estimate complexity class → determines population size & generations

PHASE 2: GENESIS (Population Initialization)
├─ Generate N diverse initial solutions (N = 3-7 based on complexity)
├─ Ensure diversity: different algorithms, paradigms, trade-offs
├─ Each solution must be complete and executable (no stubs)
└─ Tag each with: approach_type, expected_strengths, expected_weaknesses

PHASE 3: EVALUATE (Fitness Assessment)
├─ Score each solution against fitness function (1-100)
├─ Test against edge cases and adversarial inputs
├─ Measure: correctness, efficiency, readability, robustness
└─ Rank population by composite fitness score

PHASE 4: EVOLVE (Selection + Mutation + Crossover)
├─ SELECT: Keep top 50% of population
├─ MUTATE: Apply mutation operators to survivors (see §Mutations)
├─ CROSSOVER: Combine strengths of top 2 solutions into hybrid
└─ Generate new candidates to restore population size

PHASE 5: VERIFY (Gödel Proof Gate)
├─ For each evolved solution, PROVE improvement over parent
├─ Proof types: logical deduction, test coverage, complexity analysis
├─ REJECT any mutation that cannot be formally justified
└─ Only verified improvements pass to next generation

PHASE 6: CONVERGE (Termination Check)
├─ If best solution meets success criteria → DELIVER
├─ If fitness plateau (no improvement in 2 generations) → DELIVER best
├─ If generation limit reached → DELIVER best with caveats
└─ Else → Return to PHASE 4 with evolved population

PHASE 7: REFLECT (Mandatory Self-Reflection)
├─ SOLUTION REFLECTION: Why did winner win? What trait was decisive?
├─ PROCESS REFLECTION: Did I explore right space? What did I miss?
├─ ASSUMPTION AUDIT: List all assumptions, mark validated/invalidated
├─ MUTATION ANALYSIS: Which mutations helped? Which wasted cycles?
├─ PROOF QUALITY: Were proofs rigorous or hand-wavy?
├─ FAILURE ANALYSIS: What would have caught mistakes earlier?
└─ Score reasoning quality 1-10, justify score

PHASE 8: META-IMPROVE (Recursive Self-Improvement)
├─ Extract: What lessons apply to future problems?
├─ Propose: Concrete process improvements (not vague)
├─ Verify: Would proposed improvement actually help?
├─ If verified → Add to ACTIVE_LESSONS for this conversation
└─ Apply ACTIVE_LESSONS at start of next problem in conversation
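
Phases 3 through 6 can be sketched as a small loop. This is a minimal illustration under stated assumptions, not part of the skill itself: `evolve`, `fitness`, `mutate`, `target`, and `patience` are hypothetical names, crossover is omitted, and the Phase 5 proof gate is reduced to a plain fitness comparison against the parent.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")


def evolve(
    population: List[S],
    fitness: Callable[[S], float],
    mutate: Callable[[S], S],
    target: float,
    max_generations: int = 3,
    patience: int = 2,
) -> S:
    """Evaluate, select, mutate, verify, and check convergence."""
    if not population:
        raise ValueError("population must not be empty")
    size = len(population)
    best = max(population, key=fitness)
    stale = 0  # generations without improvement (plateau counter)
    for _ in range(max_generations):
        # CONVERGE: stop on success or on a fitness plateau.
        if fitness(best) >= target or stale >= patience:
            break
        # SELECT: keep the top 50% (at least one survivor).
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: max(1, size // 2)]
        # MUTATE + VERIFY: a child is accepted only if it scores
        # strictly higher than its parent; otherwise the parent stays.
        next_gen = list(survivors)
        i = 0
        while len(next_gen) < size:
            parent = survivors[i % len(survivors)]
            child = mutate(parent)
            next_gen.append(child if fitness(child) > fitness(parent) else parent)
            i += 1
        population = next_gen
        challenger = max(population, key=fitness)
        if fitness(challenger) > fitness(best):
            best, stale = challenger, 0
        else:
            stale += 1
    return best
```

With integer "solutions", a fitness of `-abs(x - 10)`, and a mutation that adds 1, a population of `[0, 3, 5, 7]` reaches 10 within three generations. A mutation that subtracts 1 from `[9, 8]` is rejected every time, so the loop stops on the plateau rule and returns 9.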

COMPLEXITY SCALING

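| Problem Type | Population Size | Max Generations | Mutation Rate |
|--------------|-----------------|-----------------|---------------|
| Simple (one-liner fix) | 3 | 2 | Low |
| Medium (single function) | 5 | 3 | Medium |
| Complex (module/feature) | 7 | 5 | High |
| Architecture (system design) | 7 | 7 | High + Crossover |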
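
The scaling table can be expressed as a lookup. The sketch below is illustrative only: the `Scaling` class, the dictionary keys, and `scaling_for` are hypothetical names, while the numbers are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scaling:
    population_size: int
    max_generations: int
    mutation_rate: str


# Values copied from the complexity scaling table; key names are illustrative.
SCALING = {
    "simple": Scaling(3, 2, "low"),
    "medium": Scaling(5, 3, "medium"),
    "complex": Scaling(7, 5, "high"),
    "architecture": Scaling(7, 7, "high + crossover"),
}


def scaling_for(problem_type: str) -> Scaling:
    """Look up loop parameters, rejecting unknown problem types."""
    try:
        return SCALING[problem_type.lower()]
    except KeyError:
        raise ValueError(f"unknown problem type: {problem_type!r}") from None
```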

FITNESS FUNCTION TEMPLATE

Define before generating solutions:

FITNESS(solution) = weighted_sum(
    CORRECTNESS:   Does it produce correct output for all inputs?      (weight: 0.40)
    ROBUSTNESS:    Does it handle edge cases and failures gracefully?  (weight: 0.25)
    EFFICIENCY:    Time/space complexity relative to optimal?          (weight: 0.15)
    READABILITY:   Can a mid-level dev understand it in 30 seconds?    (weight: 0.10)
    EXTENSIBILITY: How hard to modify for likely future requirements?  (weight: 0.10)
)

Adjust weights based on problem priorities. User can override.
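
A minimal Python sketch of the template, assuming each criterion has already been scored on the 1-100 scale used in Phase 3. The names `DEFAULT_WEIGHTS` and `composite_fitness` are hypothetical. The sum is divided by the total weight, so overridden weights do not have to add up to exactly 1.

```python
DEFAULT_WEIGHTS = {
    "correctness": 0.40,
    "robustness": 0.25,
    "efficiency": 0.15,
    "readability": 0.10,
    "extensibility": 0.10,
}


def composite_fitness(scores, weights=DEFAULT_WEIGHTS):
    """Weighted average of per-criterion scores (each on the 1-100 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalise by the total so custom weights need not sum to 1.
    return sum(scores[k] * weights[k] for k in weights) / total
```

For example, scores of 90, 80, 70, 60, and 50 (in the order listed in the template) give 36 + 20 + 10.5 + 6 + 5 = 77.5 with the default weights.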


MUTATION OPERATORS

Apply during EVOLVE phase to create variants:

Code Mutations

| Operator | Description | When to Apply |
|----------|-------------|---------------|
| SIMPLIFY | Remove unnecessary complexity | When solution is >20 lines |
| GENERALIZE | Make specific code more abstract | When pattern appears 2+ times |
| SPECIALIZE | Optimize for specific use case | When generality hurts performance |
| EXTRACT | Pull out reusable component | When code can benefit others |
| INLINE | Remove unnecessary abstraction | When abstraction adds no value |
| PARALLELIZE | Add concurrency | When independent operations exist |
| MEMOIZE | Cache repeated computations | When same inputs recur |
| GUARD | Add defensive checks | When edge cases discovered |

Architecture Mutations

| Operator | Description | When to Apply |
|----------|-------------|---------------|
| SPLIT | Decompose into smaller units | When module does too much |
| MERGE | Combine related components | When separation adds overhead |
| LAYER | Add abstraction layer | When coupling is too tight |
| FLATTEN | Remove unnecessary layers | When indirection hurts clarity |
| ASYNC | Convert to async processing | When blocking is unnecessary |
| CACHE | Add caching layer | When repeated expensive operations |
| QUEUE | Add message queue | When decoupling needed |
| RETRY | Add retry logic | When transient failures possible |
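
As an illustration of what a single mutation step might look like, the sketch below applies GUARD and MEMOIZE to a made-up parent function. Both `path_count` and `path_count_mutated` are hypothetical examples, not part of the skill. The child is meant to return the same results as the parent on valid input, which is the kind of claim a test-based check in the VERIFY phase could confirm.

```python
from functools import lru_cache


def path_count(rows: int, cols: int) -> int:
    """Parent: count monotone lattice paths with naive recursion."""
    if rows == 0 or cols == 0:
        return 1
    return path_count(rows - 1, cols) + path_count(rows, cols - 1)


def path_count_mutated(rows: int, cols: int) -> int:
    """Child: same result on valid input, with GUARD and MEMOIZE applied."""
    # GUARD: reject negative inputs, which the parent does not handle.
    if rows < 0 or cols < 0:
        raise ValueError("rows and cols must be non-negative")

    @lru_cache(maxsize=None)  # MEMOIZE: the same (r, c) pairs recur.
    def count(r: int, c: int) -> int:
        if r == 0 or c == 0:
            return 1
        return count(r - 1, c) + count(r, c - 1)

    return count(rows, cols)
```

Comparing parent and child on a handful of small inputs gives evidence of equal correctness, and the cached version only computes each `(r, c)` pair once.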

ASSUMPTION TRACKING

Track assumptions throughout the ENTIRE loop, not just in reflection.

Assumption Log Format

ASSUMPTION LOG:
| ID | Assumption | Phase | Risk | Status |
|----|------------|-------|------|--------|
| A1 | Input size < 10,000 | DECOMP | Medium | UNCHECKED |
| A2 | No concurrent modifications | GENESIS | High | VALIDATED |
| A3 | API returns JSON | GENESIS | Low | UNCHECKED |
| A4 | O(n²) acceptable for N<100 | EVOLVE | Medium | VALIDATED |

Assumption Risk Levels

| Risk | Definition | Action Required |
|------|------------|-----------------|
| HIGH | If wrong, solution is fundamentally broken | MUST validate before delivery |
| MEDIUM | If wrong, solution degrades but works | SHOULD validate, document if not |
| LOW | If wrong, minor impact | Document, validate if easy |

RULE: HIGH risk + Weak validation = STOP. Get stronger validation or flag uncertainty.
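
One way to make the log and the STOP rule checkable is a small data structure. This is a sketch under assumptions: the `Assumption`, `Risk`, and `Status` types and the `delivery_blockers` function are hypothetical names, and "weak validation" is simplified here to "status is anything other than VALIDATED".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Status(Enum):
    UNCHECKED = "unchecked"
    VALIDATED = "validated"
    INVALIDATED = "invalidated"


@dataclass
class Assumption:
    id: str
    text: str
    phase: str
    risk: Risk
    status: Status = Status.UNCHECKED


def delivery_blockers(log: List[Assumption]) -> List[Assumption]:
    """Return HIGH-risk assumptions that are not yet validated."""
    return [
        a for a in log
        if a.risk is Risk.HIGH and a.status is not Status.VALIDATED
    ]
```

An empty result means no HIGH-risk assumption blocks delivery. Anything returned should be validated or flagged as uncertain before the solution ships.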


REFLECTION OUTPUT FORMAT

### REFLECTION (Phase 7)

#### Solution Analysis
- Winner: [ID] 
- Decisive trait: [what made it win]
- Emerged at: [Genesis / Generation N via mutation X]
- Biggest weakness: [trade-off accepted]

#### Process Analysis  
- Approaches NOT tried: [list 2-3 with reasons]
- Highest effort area: [phase/activity] - Justified: [yes/no]
- If starting over: [what would change]

#### Assumption Audit
| Assumption | Risk | Status | Evidence |
|------------|------|--------|----------|
| ... | ... | ... | ... |

Unvalidated HIGH-risk assumptions: [count] ← MUST BE 0

#### Self-Score: [1-10]
Justification: [why this score]

QUICK-START HEURISTICS

For rapid application without full formalism:

When time-constrained:

  1. Generate 3 solutions (diverse approaches)
  2. Score each on correctness + robustness only
  3. Mutate top 1 solution once
  4. Verify mutation improves fitness
  5. Deliver best

When quality is paramount:

  1. Full 7-solution population
  2. 5+ generations with crossover
  3. All proof types required
  4. Meta-improvement phase mandatory

ADVERSARIAL SELF-CHECK

Before delivering final solution, ask:

  1. "What input would break this?"
  2. "What assumption am I making that might be wrong?"
  3. "If I had to attack this code, how would I?"
  4. "What would a senior engineer critique?"
  5. "Does the simplest version of this work just as well?"

If any answer reveals a flaw → one more evolution cycle.


PROJECT-SPECIFIC CONTEXT

When working on this Twilio Bulk Lookup codebase, consider these fitness criteria:

For Sidekiq Jobs:

  • Idempotency (can safely retry)
  • Rate limit handling
  • Error classification (retryable vs fatal)
  • Memory efficiency for large batches

For API Integrations:

  • Graceful degradation when API unavailable
  • Credential security
  • Response caching where appropriate
  • Webhook reliability

For Rails Models:

  • Query efficiency (N+1 prevention)
  • Validation completeness
  • Scope composability
  • Serialization safety