darwin-godwin-machine
Hybrid cognitive architecture combining Darwinian evolution with Gödel Machine self-improvement for maximum reasoning power. Use for: complex coding problems, multi-step reasoning, architecture design, debugging hard problems, any task requiring exhaustive solution exploration with formal verification. Activates when user needs "powerful reasoning", "best possible solution", "explore all options", or faces problems where first-attempt solutions typically fail.
$ Install
git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/design/darwin-godwin-machine ~/.claude/skills/claude-skill-registry
Tip: Run this command in your terminal to install the skill.
Darwin-Gödel Machine
A cognitive architecture that evolves populations of solutions while formally verifying improvements before self-modification.
Core Philosophy
Darwin: Generate diverse solution populations → Apply selection pressure → Evolve toward optimum
Gödel: Verify improvements formally before accepting → Enable recursive self-improvement → Prove modifications beneficial
Combined: Explore solution space evolutionarily, but only commit changes with verification proofs.
THE EXECUTION LOOP
Every problem runs this loop. No exceptions. Depth scales with complexity.
PHASE 1: DECOMPOSE
├─ Parse the problem into atomic sub-problems
├─ Identify constraints, success criteria, edge cases
├─ Define fitness function: What makes a solution "better"?
└─ Estimate complexity class → determines population size & generations

PHASE 2: GENESIS (Population Initialization)
├─ Generate N diverse initial solutions (N = 3-7 based on complexity)
├─ Ensure diversity: different algorithms, paradigms, trade-offs
├─ Each solution must be complete and executable (no stubs)
└─ Tag each with: approach_type, expected_strengths, expected_weaknesses

PHASE 3: EVALUATE (Fitness Assessment)
├─ Score each solution against fitness function (1-100)
├─ Test against edge cases and adversarial inputs
├─ Measure: correctness, efficiency, readability, robustness
└─ Rank population by composite fitness score

PHASE 4: EVOLVE (Selection + Mutation + Crossover)
├─ SELECT: Keep top 50% of population
├─ MUTATE: Apply mutation operators to survivors (see §Mutations)
├─ CROSSOVER: Combine strengths of top 2 solutions into hybrid
└─ Generate new candidates to restore population size

PHASE 5: VERIFY (Gödel Proof Gate)
├─ For each evolved solution, PROVE improvement over parent
├─ Proof types: logical deduction, test coverage, complexity analysis
├─ REJECT any mutation that cannot be formally justified
└─ Only verified improvements pass to next generation

PHASE 6: CONVERGE (Termination Check)
├─ If best solution meets success criteria → DELIVER
├─ If fitness plateau (no improvement in 2 generations) → DELIVER best
├─ If generation limit reached → DELIVER best with caveats
└─ Else → Return to PHASE 4 with evolved population

PHASE 7: REFLECT (Mandatory Self-Reflection)
├─ SOLUTION REFLECTION: Why did winner win? What trait was decisive?
├─ PROCESS REFLECTION: Did I explore right space? What did I miss?
├─ ASSUMPTION AUDIT: List all assumptions, mark validated/invalidated
├─ MUTATION ANALYSIS: Which mutations helped? Which wasted cycles?
├─ PROOF QUALITY: Were proofs rigorous or hand-wavy?
├─ FAILURE ANALYSIS: What would have caught mistakes earlier?
└─ Score reasoning quality 1-10, justify score

PHASE 8: META-IMPROVE (Recursive Self-Improvement)
├─ Extract: What lessons apply to future problems?
├─ Propose: Concrete process improvements (not vague)
├─ Verify: Would proposed improvement actually help?
├─ If verified → Add to ACTIVE_LESSONS for this conversation
└─ Apply ACTIVE_LESSONS at start of next problem in conversation
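The control flow of Phases 2 through 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evolve` and its parameters are hypothetical, the verify gate is reduced to "measured fitness strictly improves over the parent" as a stand-in for the proof types listed in Phase 5, and crossover and repopulation with fresh candidates are left out for brevity.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")  # a candidate solution, whatever shape it takes


def evolve(
    seeds: List[S],
    fitness: Callable[[S], float],
    mutate: Callable[[S], S],
    target: float,
    max_generations: int,
) -> S:
    """Hypothetical sketch of the select / mutate / verify / converge cycle."""
    if not seeds:
        raise ValueError("need at least one seed solution")
    size = len(seeds)
    # GENESIS + EVALUATE: rank the initial population, best first.
    population = sorted(seeds, key=fitness, reverse=True)
    best, stale = population[0], 0
    for _ in range(max_generations):
        if fitness(best) >= target:  # CONVERGE: success criteria met
            break
        # SELECT: keep the top half (rounded up so a survivor always remains).
        survivors = population[: (len(population) + 1) // 2]
        verified = []
        for parent in survivors:
            child = mutate(parent)  # MUTATE
            # VERIFY (simplified): accept only a strict improvement over the parent.
            if fitness(child) > fitness(parent):
                verified.append(child)
        population = sorted(survivors + verified, key=fitness, reverse=True)[:size]
        if fitness(population[0]) > fitness(best):
            best, stale = population[0], 0
        else:
            stale += 1
            if stale >= 2:  # CONVERGE: plateau for 2 generations
                break
    return best


# Toy problem: integers scored by closeness to 42, mutated by adding 3.
score = lambda x: 100 - abs(x - 42)
print(evolve([10, 30, 36], score, lambda x: x + 3, target=100, max_generations=5))  # prints 42
```

In the toy run, a mutation that overshoots (for example adding 100) never passes the gate, so the loop stops on the plateau rule and returns the best seed unchanged.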
COMPLEXITY SCALING
| Problem Type | Population Size | Max Generations | Mutation Rate |
|---|---|---|---|
| Simple (one-liner fix) | 3 | 2 | Low |
| Medium (single function) | 5 | 3 | Medium |
| Complex (module/feature) | 7 | 5 | High |
| Architecture (system design) | 7 | 7 | High + Crossover |
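One way to make the table machine-readable is a small lookup, sketched below. The names `LoopBudget`, `BUDGETS`, and `budget_for`, and the short keys used for the problem types, are hypothetical; the numbers are copied from the table.

```python
from typing import NamedTuple


class LoopBudget(NamedTuple):
    population_size: int
    max_generations: int
    mutation_rate: str


# Values taken from the complexity-scaling table; the keys are illustrative.
BUDGETS = {
    "simple": LoopBudget(3, 2, "low"),
    "medium": LoopBudget(5, 3, "medium"),
    "complex": LoopBudget(7, 5, "high"),
    "architecture": LoopBudget(7, 7, "high + crossover"),
}


def budget_for(problem_type: str) -> LoopBudget:
    """Return the loop budget for a problem type, case-insensitively."""
    try:
        return BUDGETS[problem_type.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown problem type: {problem_type!r}") from None


print(budget_for("Complex"))
```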
FITNESS FUNCTION TEMPLATE
Define before generating solutions:
FITNESS(solution) = weighted_sum(
CORRECTNESS: Does it produce correct output for all inputs? (weight: 0.40)
ROBUSTNESS: Does it handle edge cases and failures gracefully? (weight: 0.25)
EFFICIENCY: Time/space complexity relative to optimal? (weight: 0.15)
READABILITY: Can a mid-level dev understand it in 30 seconds? (weight: 0.10)
EXTENSIBILITY: How hard to modify for likely future requirements? (weight: 0.10)
)
Adjust weights based on problem priorities. User can override.
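The template is a plain weighted sum, which might be written as follows. This sketch assumes each criterion is scored on the 1-100 scale from Phase 3; the names `DEFAULT_WEIGHTS` and `composite_fitness` are hypothetical, and the check that weights sum to 1 is an added safeguard rather than something the template requires.

```python
from typing import Dict, Optional

# Default weights from the fitness template.
DEFAULT_WEIGHTS = {
    "correctness": 0.40,
    "robustness": 0.25,
    "efficiency": 0.15,
    "readability": 0.10,
    "extensibility": 0.10,
}


def composite_fitness(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted sum of per-criterion scores (each assumed to be 1-100)."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


example = {
    "correctness": 90,
    "robustness": 80,
    "efficiency": 70,
    "readability": 60,
    "extensibility": 50,
}
# 0.40*90 + 0.25*80 + 0.15*70 + 0.10*60 + 0.10*50 = 77.5
print(round(composite_fitness(example), 2))
```

Passing a different `weights` dictionary corresponds to the user override mentioned above, for example weighting efficiency higher for a hot code path.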
MUTATION OPERATORS
Apply during EVOLVE phase to create variants:
Code Mutations
| Operator | Description | When to Apply |
|---|---|---|
| SIMPLIFY | Remove unnecessary complexity | When solution is >20 lines |
| GENERALIZE | Make specific code more abstract | When pattern appears 2+ times |
| SPECIALIZE | Optimize for specific use case | When generality hurts performance |
| EXTRACT | Pull out reusable component | When code can benefit others |
| INLINE | Remove unnecessary abstraction | When abstraction adds no value |
| PARALLELIZE | Add concurrency | When independent operations exist |
| MEMOIZE | Cache repeated computations | When same inputs recur |
| GUARD | Add defensive checks | When edge cases discovered |
Architecture Mutations
| Operator | Description | When to Apply |
|---|---|---|
| SPLIT | Decompose into smaller units | When module does too much |
| MERGE | Combine related components | When separation adds overhead |
| LAYER | Add abstraction layer | When coupling is too tight |
| FLATTEN | Remove unnecessary layers | When indirection hurts clarity |
| ASYNC | Convert to async processing | When blocking is unnecessary |
| CACHE | Add caching layer | When expensive operations repeat |
| QUEUE | Add message queue | When decoupling needed |
| RETRY | Add retry logic | When transient failures possible |
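Some of the "When to Apply" conditions in the code-mutation table are concrete enough to check mechanically. The sketch below covers only those four; `SolutionStats` and its fields are hypothetical measurements that a caller would have to supply, and the remaining operators still need judgment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SolutionStats:
    """Hypothetical measurements of one candidate solution."""
    line_count: int
    max_pattern_occurrences: int   # how often the most repeated pattern appears
    has_independent_operations: bool
    has_recurring_inputs: bool


def applicable_operators(stats: SolutionStats) -> List[str]:
    """List the code mutations whose trigger condition holds."""
    operators = []
    if stats.line_count > 20:                 # SIMPLIFY: solution is >20 lines
        operators.append("SIMPLIFY")
    if stats.max_pattern_occurrences >= 2:    # GENERALIZE: pattern appears 2+ times
        operators.append("GENERALIZE")
    if stats.has_independent_operations:      # PARALLELIZE: independent operations exist
        operators.append("PARALLELIZE")
    if stats.has_recurring_inputs:            # MEMOIZE: same inputs recur
        operators.append("MEMOIZE")
    return operators


stats = SolutionStats(
    line_count=35,
    max_pattern_occurrences=3,
    has_independent_operations=False,
    has_recurring_inputs=True,
)
print(applicable_operators(stats))  # prints ['SIMPLIFY', 'GENERALIZE', 'MEMOIZE']
```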
ASSUMPTION TRACKING
Track assumptions throughout the ENTIRE loop, not just in reflection.
Assumption Log Format
ASSUMPTION LOG:
┌─────┬─────────────────────────────┬─────────┬──────────┬───────────┐
│ ID  │ Assumption                  │ Phase   │ Risk     │ Status    │
├─────┼─────────────────────────────┼─────────┼──────────┼───────────┤
│ A1  │ Input size < 10,000         │ DECOMP  │ Medium   │ UNCHECKED │
│ A2  │ No concurrent modifications │ GENESIS │ High     │ VALIDATED │
│ A3  │ API returns JSON            │ GENESIS │ Low      │ UNCHECKED │
│ A4  │ O(n²) acceptable for N<100  │ EVOLVE  │ Medium   │ VALIDATED │
└─────┴─────────────────────────────┴─────────┴──────────┴───────────┘
Assumption Risk Levels
| Risk | Definition | Action Required |
|---|---|---|
| HIGH | If wrong, solution is fundamentally broken | MUST validate before delivery |
| MEDIUM | If wrong, solution degrades but works | SHOULD validate, document if not |
| LOW | If wrong, minor impact | Document, validate if easy |
RULE: HIGH risk + Weak validation = STOP. Get stronger validation or flag uncertainty.
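The log and the STOP rule can be modelled with a small data structure, as in the sketch below. All names here (`Risk`, `Status`, `Assumption`, `blocking_assumptions`) are hypothetical, and "weak validation" is interpreted simply as "status is not VALIDATED", which is an assumption about the rule's intent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Risk(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Status(Enum):
    UNCHECKED = "UNCHECKED"
    VALIDATED = "VALIDATED"
    INVALIDATED = "INVALIDATED"


@dataclass
class Assumption:
    id: str
    text: str
    phase: str
    risk: Risk
    status: Status = Status.UNCHECKED


def blocking_assumptions(log: List[Assumption]) -> List[Assumption]:
    """HIGH-risk assumptions that are not validated; delivery should stop on any."""
    return [a for a in log if a.risk is Risk.HIGH and a.status is not Status.VALIDATED]


# The sample log from the format above.
log = [
    Assumption("A1", "Input size < 10,000", "DECOMP", Risk.MEDIUM),
    Assumption("A2", "No concurrent modifications", "GENESIS", Risk.HIGH, Status.VALIDATED),
    Assumption("A3", "API returns JSON", "GENESIS", Risk.LOW),
    Assumption("A4", "O(n^2) acceptable for N<100", "EVOLVE", Risk.MEDIUM, Status.VALIDATED),
]
print([a.id for a in blocking_assumptions(log)])  # prints []
```

With the sample log nothing blocks delivery, because the only HIGH-risk entry (A2) is validated. Resetting A2 to `Status.UNCHECKED` would make it the single blocker.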
REFLECTION OUTPUT FORMAT
### REFLECTION (Phase 7)
#### Solution Analysis
- Winner: [ID]
- Decisive trait: [what made it win]
- Emerged at: [Genesis / Generation N via mutation X]
- Biggest weakness: [trade-off accepted]
#### Process Analysis
- Approaches NOT tried: [list 2-3 with reasons]
- Highest effort area: [phase/activity] → Justified: [yes/no]
- If starting over: [what would change]
#### Assumption Audit
| Assumption | Risk | Status | Evidence |
|------------|------|--------|----------|
| ... | ... | ... | ... |
Unvalidated HIGH-risk assumptions: [count] ← MUST BE 0
#### Self-Score: [1-10]
Justification: [why this score]
QUICK-START HEURISTICS
For rapid application without full formalism:
When time-constrained:
- Generate 3 solutions (diverse approaches)
- Score each on correctness + robustness only
- Mutate top 1 solution once
- Verify mutation improves fitness
- Deliver best
When quality is paramount:
- Full 7-solution population
- 5+ generations with crossover
- All proof types required
- Meta-improvement phase mandatory
ADVERSARIAL SELF-CHECK
Before delivering final solution, ask:
- "What input would break this?"
- "What assumption am I making that might be wrong?"
- "If I had to attack this code, how would I?"
- "What would a senior engineer critique?"
- "Does the simplest version of this work just as well?"
If any answer reveals a flaw → one more evolution cycle.
PROJECT-SPECIFIC CONTEXT
When working on this Twilio Bulk Lookup codebase, consider these fitness criteria:
For Sidekiq Jobs:
- Idempotency (can safely retry)
- Rate limit handling
- Error classification (retryable vs fatal)
- Memory efficiency for large batches
For API Integrations:
- Graceful degradation when API unavailable
- Credential security
- Response caching where appropriate
- Webhook reliability
For Rails Models:
- Query efficiency (N+1 prevention)
- Validation completeness
- Scope composability
- Serialization safety
