Testing & Security
Testing frameworks, security tools, and best practices
9063 skills in this category
Testing Strategy
Systematic testing methodology for Go projects using TDD, coverage-driven gap closure, fixture patterns, and CLI testing. Use when establishing test strategy from scratch, improving test coverage from 60-75% to 80%+, creating test infrastructure with mocks and fixtures, building CLI test suites, or systematizing ad-hoc testing. Provides 8 documented patterns (table-driven, golden file, fixture, mocking, CLI testing, integration, helper utilities, coverage-driven gap closure), 3 automation tools (coverage analyzer 186x speedup, test generator 200x speedup, methodology guide 7.5x speedup). Validated across 3 project archetypes with 3.1x average speedup, 5.8% adaptation effort, 89% transferability to Python/Rust/TypeScript.
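As an illustration of the table-driven pattern named above, here is a minimal sketch in Python (one of the languages the description lists as a transfer target). The `slugify` function, the `CASES` table, and `run_table` are hypothetical names invented for this example; they are not part of the skill.

```python
# Table-driven test sketch: each row is (case name, input, expected output).
# slugify, CASES, and run_table are hypothetical, for illustration only.

def slugify(title: str) -> str:
    """Lowercase the title and join whitespace-separated words with hyphens."""
    return "-".join(title.lower().split())


CASES = [
    ("simple", "Hello World", "hello-world"),
    ("extra whitespace", "  Hello   World  ", "hello-world"),
    ("single word", "hello", "hello"),
    ("empty string", "", ""),
]


def run_table() -> list[str]:
    """Run every row and return a list of failure messages (empty if all pass)."""
    failures = []
    for name, given, want in CASES:
        got = slugify(given)
        if got != want:
            failures.append(f"{name}: slugify({given!r}) = {got!r}, want {want!r}")
    return failures


if __name__ == "__main__":
    for message in run_table():
        print(message)
```

Adding a new scenario means adding one row to the table rather than writing a new test function, which is the main appeal of the pattern.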
rails-ai:testing
Use when testing Rails applications - TDD, Minitest, fixtures, model testing, mocking, test helpers
rails-ai:mailers
Use when sending emails - ActionMailer with async delivery via SolidQueue, templates, previews, and testing
task-decomposer
Decompose Linear todos into actionable, testable chunks with rationale, as-is/to-be analysis, expected outputs, and risk assessment for effective project management
grey-haven-test-generation
Comprehensive test suite generation with unit tests, integration tests, edge cases, and error handling. Use when generating tests for existing code, improving coverage, or creating systematic test suites. Triggers: 'generate tests', 'add tests', 'test coverage', 'write tests for', 'create test suite'.
Dependency Health
Security-first dependency management methodology with batch remediation, policy-driven compliance, and automated enforcement. Use when security vulnerabilities exist in dependencies, dependency freshness is low (outdated packages), license compliance is needed, or systematic dependency management is lacking. Provides security-first prioritization (critical vulnerabilities immediately, high within a week, medium within a month), batch remediation strategy (group compatible updates, test together, single PR), policy-driven compliance framework (security policies, freshness policies, license policies), and automation tools for vulnerability scanning, update detection, and compliance checking. Validated in meta-cc with 6x speedup (9 hours manual to 1.5 hours systematic), 3 iterations, 88% transferability across package managers (concepts universal, tools vary by ecosystem).
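The severity-based prioritization in this description can be sketched roughly as follows. The day counts (0, 7, 30) are an assumed reading of "immediately", "within a week", and "within a month"; the function names and the shape of a finding are hypothetical and do not come from the skill itself.

```python
from datetime import date, timedelta

# Assumed mapping of severity to remediation window in days.
# The source gives the windows in words only; the numbers are an interpretation.
REMEDIATION_DAYS = {"critical": 0, "high": 7, "medium": 30}


def remediation_deadline(severity: str, found_on: date) -> date:
    """Return the date by which a finding of this severity should be fixed."""
    return found_on + timedelta(days=REMEDIATION_DAYS[severity.lower()])


def prioritize(findings: list[dict]) -> list[dict]:
    """Sort findings so the earliest remediation deadline comes first."""
    return sorted(
        findings,
        key=lambda f: remediation_deadline(f["severity"], f["found_on"]),
    )


if __name__ == "__main__":
    today = date(2024, 1, 1)
    queue = prioritize([
        {"package": "pkg-a", "severity": "medium", "found_on": today},
        {"package": "pkg-b", "severity": "critical", "found_on": today},
        {"package": "pkg-c", "severity": "high", "found_on": today},
    ])
    for finding in queue:
        print(finding["package"], remediation_deadline(finding["severity"], today))
```

Sorting by deadline rather than by severity label alone lets an old high-severity finding outrank a critical one discovered later, which may or may not match the policy a given team wants.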
grey-haven-tdd-orchestration
Master TDD orchestration with multi-agent coordination, strict red-green-refactor enforcement, automated test generation, coverage tracking, and >90% coverage quality gates. Coordinates tdd-python, tdd-typescript, and test-generator agents. Use when implementing features with TDD workflow, coordinating multiple TDD agents, enforcing test-first development, or when user mentions 'TDD workflow', 'test-first', 'TDD orchestration', 'multi-agent TDD', 'test coverage', or 'red-green-refactor'.
oauth2-specialist
Security-focused OAuth2 expert for reviewing PRs and issues involving @jasonraimondi/ts-oauth2-server. Educational skeptic who identifies vulnerabilities, enforces RFC compliance, detects breaking changes, and suggests security test cases.
grey-haven-pr-template
Generate pull request descriptions following Grey Haven Studio standards with clear summary, motivation, implementation details, testing strategy, and comprehensive checklist. Use when creating or reviewing pull requests.
systematic-debugging
Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
grey-haven-tdd-python
Python Test-Driven Development expertise with pytest, strict red-green-refactor methodology, FastAPI testing patterns, and Pydantic model testing. Use when implementing Python features with TDD, writing pytest tests, testing FastAPI endpoints, developing with test-first approach, or when user mentions 'Python TDD', 'pytest', 'FastAPI testing', 'red-green-refactor', 'Python unit tests', 'test-driven Python', or 'Python test coverage'.
Error Recovery
Comprehensive error handling methodology with 13-category taxonomy, diagnostic workflows, recovery patterns, and prevention guidelines. Use when error rate >5%, MTTD/MTTR too high, errors recurring, need systematic error prevention, or building error handling infrastructure. Provides error taxonomy (file operations, API calls, data validation, resource management, concurrency, configuration, dependency, network, parsing, state management, authentication, timeout, edge cases - 95.4% coverage), 8 diagnostic workflows, 5 recovery patterns, 8 prevention guidelines, 3 automation tools (file path validation, read-before-write check, file size validation - 23.7% error prevention). Validated against 1,336 historical errors, with 85-90% transferability across languages/platforms and a retrospective-validation confidence of 0.79.
grey-haven-tdd-typescript
TypeScript/JavaScript Test-Driven Development with Vitest, strict red-green-refactor methodology, React component testing, and comprehensive coverage patterns. Use when implementing TypeScript features with TDD, writing Vitest tests, testing React components, developing with test-first approach, or when user mentions 'TypeScript TDD', 'Vitest', 'React testing', 'JavaScript TDD', 'red-green-refactor', 'TypeScript unit tests', or 'test-driven TypeScript'.
grey-haven-authentication-patterns
Grey Haven's authentication patterns using better-auth - magic links, passkeys, OAuth providers, session management with Redis, JWT claims with tenant_id, and Doppler for auth secrets. Use when implementing authentication features.
CI/CD Optimization
Comprehensive CI/CD pipeline methodology with quality gates, release automation, smoke testing, observability, and performance tracking. Use when setting up CI/CD from scratch, build time over 5 minutes, no automated quality gates, manual release process, lack of pipeline observability, or broken releases reaching production. Provides 5 quality gate categories (coverage threshold 75-80%, lint blocking, CHANGELOG validation, build verification, test pass rate), release automation with conventional commits and automatic CHANGELOG generation, 25 smoke tests across execution/consistency/structure categories, CI observability with metrics tracking and regression detection, performance optimization including native-only testing for Go cross-compilation. Validated in meta-cc with 91.7% pattern validation rate (11/12 patterns), 2.5-3.5x estimated speedup, GitHub Actions native with 70-80% transferability to GitLab CI and Jenkins.
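A coverage-threshold quality gate of the kind listed above might look like the sketch below. The function name, its inputs, and the default of 75.0 (the lower end of the 75-80% range in the description) are assumptions for illustration, not the skill's actual tooling.

```python
def coverage_gate(covered_lines: int, total_lines: int,
                  threshold: float = 75.0) -> tuple[bool, float]:
    """Return (passed, coverage_percent) for a simple line-coverage gate.

    threshold defaults to 75.0, an assumed value taken from the low end of
    the 75-80% range mentioned in the description.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    percent = 100.0 * covered_lines / total_lines
    return percent >= threshold, percent


if __name__ == "__main__":
    passed, percent = coverage_gate(covered_lines=812, total_lines=1000)
    print(f"coverage {percent:.1f}%:", "pass" if passed else "fail")
    # A CI job would typically exit non-zero on failure to block the merge.
    raise SystemExit(0 if passed else 1)
```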
create-unit-test
Create and run unit tests following the project's architecture and guidelines (Robolectric, naming, location).
grey-haven-evaluation
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
grey-haven-react-tanstack-testing
Specialized testing for React applications using TanStack ecosystem (Query, Router, Table, Form) with Vite and Vitest. Use when testing React + TanStack apps, mocking server state, testing router, or validating query behavior. Triggers: 'TanStack testing', 'React Query testing', 'test TanStack', 'mock query', 'router test'.
grey-haven-security-analysis
Comprehensive security analysis with vulnerability detection, OWASP Top 10 compliance, penetration testing simulation, and remediation. Use when conducting security audits, pre-deployment security checks, investigating vulnerabilities, or performing compliance assessments.
Retrospective Validation
Validate methodology effectiveness using historical data without live deployment. Use when rich historical data exists (100+ instances), methodology targets observable patterns (error prevention, test strategy, performance optimization), pattern matching is feasible with clear detection rules, and live deployment has high friction (CI/CD integration effort, user study time, deployment risk). Enables 40-60% time reduction vs prospective validation, 60-80% cost reduction. Confidence calculation model provides statistical rigor. Validated in error recovery (1,336 errors, 23.7% prevention, 0.79 confidence).