test-driven-development
Use when writing tests or implementing code. Defines RED-GREEN-REFACTOR cycle and test execution workflow.
Install

```
git clone https://github.com/craigtkhill/atui-tools /tmp/atui-tools && cp -r /tmp/atui-tools/skills/test-driven-development ~/.claude/skills/atui-tools/
```

Tip: Run this command in your terminal to install the skill.
Test-Driven Development (TDD) Skill
This skill defines the proper TDD workflow: write test(s), run them (RED), implement minimal code (GREEN), run ALL tests (VERIFY), repeat.
Language-specific details:
- For Python projects: See `test-driven-development/PYTHON.md` for pytest patterns, running tests, and Python-specific examples
- For other languages: Add new files to the `test-driven-development/` directory as needed
TDD Cycle: RED → GREEN → REFACTOR
How Many Tests to Write at Once?
Write ONE test at a time when:
- Design is uncertain or exploratory
- Domain is complex and unfamiliar
- You need to review and course correct after each step
- Implementation approach is unclear
Write MULTIPLE tests upfront when:
- Requirements are clear and well-understood
- Domain is familiar
- Design is straightforward
- You're confident in the implementation approach
1. RED: Write Failing Test(s)
Write test(s) that fail.
Key principle: Each test must fail for the right reason (missing implementation, not syntax error).
CRITICAL - Run the tests - they MUST fail:
- ❌ DO NOT write implementation before running the tests
- ❌ DO NOT skip the RED phase
- ✅ DO run the tests to verify they fail
- Verify they fail with expected error (import error, assertion failure, etc.)
- Expected outcome: RED (tests fail)
Why RED is mandatory:
- Confirms the tests actually test something
- Prevents false positives (tests that always pass)
- Validates test setup is correct
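A minimal sketch of the RED step, assuming pytest; the `pricing` module and `apply_discount` function are hypothetical and do not exist yet, which is exactly why the test fails:

```python
# tests/test_pricing.py: written BEFORE any implementation exists
import pytest

from pricing import apply_discount  # hypothetical module; the failing import is the right reason


def test_ten_percent_discount_reduces_price():
    """Hypothetical spec requirement: a 10% discount on 100.00 yields 90.00."""
    assert apply_discount(100.00, 0.10) == pytest.approx(90.00)
```

Running this test now should fail at collection with a ModuleNotFoundError; if it somehow passes, it is not testing anything.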
2. GREEN: Write Minimal Implementation
Write the MINIMUM code to make the tests pass.
Key principle: Don't add features not yet tested. Don't over-engineer.
Run the tests again:
- Run the tests to verify they now pass
- Expected outcome: GREEN (tests pass)
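Continuing the same hypothetical example, the GREEN step adds only enough code to satisfy that one test:

```python
# pricing.py: the minimum that makes the current test pass
def apply_discount(price: float, rate: float) -> float:
    """Return the price after applying a fractional discount rate."""
    return price * (1 - rate)
```

Input validation, rounding rules, and error handling are deliberately absent until a test demands them.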
3. VERIFY: Run ALL Tests
CRITICAL: After tests pass, run ALL tests to ensure nothing broke.
Key principle: Never break existing functionality.
Run complete test suite:
- Run all tests in the project
- Verify all tests pass (new and existing)
- Expected outcome: All tests GREEN
4. REFACTOR: Improve Code Quality
After tests pass, consider refactoring to improve code quality.
Use the refactor skill for detailed refactoring guidance.
Key principle: Refactor only when tests are GREEN. Tests protect you during refactoring.
When to refactor:
- Code duplication exists
- Code is unclear or hard to understand
- Better patterns or abstractions are apparent
- Performance can be improved
- Before implementing new features that may duplicate existing patterns
Refactoring safety:
- Run ALL tests after each refactoring change
- If any test fails, revert and try a different approach
- Keep refactoring changes small and incremental
Note: Refactoring is optional on each cycle. You can skip it if the code is already clean.
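A sketch of a refactor done only while tests are green, continuing the hypothetical pricing example: duplicated arithmetic is collapsed into one place without changing observable behaviour, and the full suite is re-run afterwards.

```python
# Before: two near-identical functions accumulated across separate GREEN steps
def apply_discount(price: float, rate: float) -> float:
    return price * (1 - rate)

def apply_member_discount(price: float) -> float:
    return price * (1 - 0.15)


# After: one function owns the formula; behaviour is unchanged, so all tests still pass
def apply_discount(price: float, rate: float) -> float:
    return price * (1 - rate)

def apply_member_discount(price: float) -> float:
    return apply_discount(price, 0.15)
```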
5. REPEAT: Continue
After all tests pass, continue with next requirements and repeat the cycle.
TDD Rules
Rule 1: Update Spec as You Go
- ✅ DO update SPEC.md to mark requirements as tested after tests pass
- ✅ Keep spec in sync with implementation progress
Updating SPEC.md markers:
- After writing a unit test: `[O][O]` → `[U][O]` (test exists, code pending)
- After code passes the unit test: `[U][O]` → `[U][X]` (test and code complete)
- After writing an acceptance test: `[O][O]` → `[A][O]` (acceptance test exists, code pending)
- After code passes the acceptance test: `[A][O]` → `[A][X]` (acceptance test and code complete)
- When an acceptance test passes, mark related unit-tested features as implemented: `[U][O]` → `[U][X]`
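A hypothetical SPEC.md fragment showing the successive states of one requirement line as it moves through the cycle; the requirement wording is invented, only the marker notation comes from this skill:

```
- [O][O] Prices support percentage discounts    (not yet tested)
- [U][O] Prices support percentage discounts    (unit test written, code pending)
- [U][X] Prices support percentage discounts    (unit test and code complete)
```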
Rule 2: Always Run Tests
CRITICAL: Tests MUST go through the RED → GREEN → VERIFY cycle:
- RED: Run tests to see them FAIL (before implementing)
- GREEN: Run tests to see them PASS (after implementing)
- VERIFY: Run ALL tests in the project to verify nothing broke
❌ NEVER skip the RED phase - Always see tests fail before implementing
Why this matters:
- RED phase confirms tests actually test something (prevents false positives)
- GREEN phase confirms implementation works
- VERIFY phase ensures no regression in existing functionality
Rule 3: Minimal Implementation
- Write ONLY enough code to make the current test pass
- Don't add features not yet tested
- Don't over-engineer solutions
- Prefer simple over clever
Rule 4: Test One Thing
- Each test validates ONE behavior
- One assertion per test (unless assertions are intrinsically coupled)
- Clear test names describe what's being tested
- Multiple tests enable better parallelization
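A sketch of the one-behaviour-per-test rule, reusing the hypothetical `apply_discount` function from the earlier examples: two focused tests rather than one test asserting several unrelated things.

```python
import pytest

from pricing import apply_discount  # hypothetical module from the RED/GREEN examples


def test_zero_rate_leaves_price_unchanged():
    """One behaviour: a zero discount rate is a no-op."""
    assert apply_discount(100.00, 0.0) == pytest.approx(100.00)


def test_full_rate_reduces_price_to_zero():
    """One behaviour: a rate of 1.0 makes the item free."""
    assert apply_discount(100.00, 1.0) == pytest.approx(0.00)
```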
Rule 5: Fail Fast
If any test fails:
- Stop immediately
- Fix the broken tests
- Don't continue until all tests GREEN
TDD Workflow Diagram
```
Start: Pick requirements from spec
        │
        ▼
1. RED: Write test(s) for requirement(s)
        │
        ▼
2. Run tests
   Expected: FAIL (RED)
        │
        ▼
3. GREEN: Write minimal implementation
        │
        ▼
4. Run tests
   Expected: PASS (GREEN)
        │
        ▼
5. VERIFY: Run ALL tests
   Expected: All PASS
        │
   All pass? ──No──> Fix broken tests
        │
       Yes
        │
        ▼
6. REFACTOR: Improve code
   Run ALL tests after each change
        │
        ▼
7. More requirements? ──Yes──> Repeat
        │
        No
        │
        ▼
      Done
```
Test Organization
Unit Tests vs Acceptance Tests vs Integration Tests
Unit Tests:
- Test individual functions/classes in isolation
- Fast, focused, and run frequently
- Located within the feature directory
- Should cover the vast majority of your testing needs
- Marked with `[U]` in SPEC.md
Acceptance Tests:
- Test complete features from a user/business perspective
- Verify requirements are met end-to-end within a feature
- Located within the feature directory
- Test scenarios from the spec (Given/When/Then)
- May involve multiple units working together within the same feature
- Marked with `[A]` in SPEC.md
- When to use:
- Testing complete user workflows that require external APIs
- Verifying business requirements are satisfied
- Testing features that span multiple units within the feature
- End-to-end validation of a feature
Integration Tests:
- Test interactions between multiple modules or external systems
- Slower, more complex, expensive to maintain
- Located in the `tests/` directory at project root
- Use sparingly and only for tactical purposes:
- When unit tests cannot adequately verify the behavior
- Testing interactions with external dependencies (databases, APIs, LLMs)
- End-to-end workflows that span multiple modules
- Verifying third-party library integration (e.g., Pydantic AI model introspection)
Test Hierarchy:
- Default to unit tests - fast, isolated, cover individual behaviors
- Use acceptance tests - when you need end-to-end feature validation
- Use integration tests sparingly - only for tactical external integration needs
Test Grouping
Organize tests by requirement sections:
- Group related tests together
- Use clear section/class names
- Mirror the spec structure
Test Naming
Follow project conventions for test names:
- Descriptive names that explain what's being tested
- Include requirement reference in documentation
- Use consistent naming pattern
Test Documentation
Each test should have:
- Clear name describing behavior
- Documentation linking to spec requirement
- Given-When-Then structure
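A sketch of a documented test in that style, assuming pytest; the requirement ID and the `apply_member_discount` function are hypothetical carry-overs from the earlier examples:

```python
import pytest

from pricing import apply_member_discount  # hypothetical


def test_member_discount_applied_at_checkout():
    """REQ-3.2 (hypothetical): members receive a 15% discount at checkout.

    Given a member's cart totalling 200.00
    When the member discount is applied
    Then the amount charged is 170.00
    """
    # Given
    cart_total = 200.00
    # When
    charged = apply_member_discount(cart_total)
    # Then
    assert charged == pytest.approx(170.00)
```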
Integration with STDD Workflow
TDD is STEP 2 of the spec-test-driven development workflow:
1. Write Spec (use the `spec` skill)
2. Write Tests ← (use this `tdd` skill)
3. Implement Code (minimal code to pass tests)
When writing tests:
- Reference spec requirements in test documentation
- Follow test organization from spec sections
- Write tests for requirements from the spec
- Decide whether to write one test at a time or multiple tests based on complexity and certainty (see "How Many Tests to Write at Once?" above)
Common Mistakes to Avoid
❌ Not Running Tests Enough
- Skipping the RED phase (not verifying test fails)
- Skipping the GREEN phase (not verifying test passes)
- Skipping the VERIFY phase (not running all tests)
❌ Over-Implementing
- Adding features not yet tested
- Writing more code than needed to pass the test
- Implementing based on assumptions rather than tests
❌ Writing Bad Tests
- Tests that don't fail when they should
- Tests with multiple assertions (unless intrinsically coupled)
- Tests that don't clearly document what they're testing
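An example of the first kind of bad test, a false positive that the RED phase is designed to catch; it reuses the hypothetical `apply_discount` from the earlier examples:

```python
from pricing import apply_discount  # hypothetical module from the earlier examples


def test_discount_never_crashes():
    """Bad: this test passes even when apply_discount returns the wrong value or raises."""
    try:
        apply_discount(100.00, 0.10)
    except Exception:
        pass  # the exception is swallowed and nothing is asserted, so the test cannot fail
```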
Checklist Before Moving Forward
Use this checklist for your TDD workflow in this exact order:
- Tests written with clear names and documentation
- Tests reference specific requirements from spec
- Tests run and FAILED (RED) ✓ ← DO NOT SKIP THIS
- Minimal implementation written (ONLY after seeing RED)
- Tests run and PASSED (GREEN) ✓
- ALL tests run and PASSED (VERIFY) ✓
- SPEC.md updated with test markers ([U][O] or [A][O]) ✓
- SPEC.md updated with implementation markers ([U][X] or [A][X]) if code complete ✓
- Code refactored if needed (optional)
- ALL tests still pass after refactoring
- No broken tests
- Ready to continue
Critical reminder: Steps 3-6 MUST happen in order:
- RED (tests fail) → then implement → GREEN (tests pass) → then VERIFY (all tests pass)
Repository

https://github.com/craigtkhill/atui-tools