LLM & Agents
6763 skills in Data & AI > LLM & Agents
grey-haven-llm-project-development
Build LLM-powered applications and pipelines using proven methodology - task-model fit analysis, pipeline architecture, structured outputs, file-based state, and cost estimation. Use when building AI features, data processing pipelines, agents, or any LLM-integrated system. Inspired by Karpathy's methodology and production case studies.
design-prompt-optimizer
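Professional design evaluation and optimization skill based on DesignPrompt design principles, focused on the visual presentation of interfaces. Activates automatically when you say "optimize the design" (优化设计) or "take a look at this design" (看看这个设计怎么样). Concentrates on the visual design layer: layout, color schemes, visual hierarchy, typography, spacing, and whitespace. De-emphasizes interaction design and code implementation in favor of making the interface look more polished and professional.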
Rapid Convergence
Converge a methodology in 3-4 iterations (vs. the standard 5-7) when clear baseline metrics exist, domain scope is focused, and direct validation is possible. Use when you have a V_meta baseline ≥0.40, quantifiable success criteria, retrospective validation data, and generic agents are sufficient. Enables a 40-60% time reduction (10-15 hours vs. 20-30 hours) without sacrificing quality. A prediction model helps estimate iteration count during experiment planning. Validated in error recovery (3 iterations, 10 hours, V_instance=0.83, V_meta=0.85).
grey-haven-testing-strategy
Grey Haven's comprehensive testing strategy - Vitest unit/integration/e2e for TypeScript, pytest markers for Python, >80% coverage requirement, fixture patterns, and Doppler for test environments. Use when writing tests, setting up test infrastructure, running tests, debugging test failures, improving coverage, configuring CI/CD, or when user mentions 'test', 'testing', 'pytest', 'vitest', 'coverage', 'TDD', 'test-driven development', 'unit test', 'integration test', 'e2e', 'end-to-end', 'test fixtures', 'mocking', 'test setup', 'CI testing'.
Testing Strategy
Systematic testing methodology for Go projects using TDD, coverage-driven gap closure, fixture patterns, and CLI testing. Use when establishing test strategy from scratch, improving test coverage from 60-75% to 80%+, creating test infrastructure with mocks and fixtures, building CLI test suites, or systematizing ad-hoc testing. Provides 8 documented patterns (table-driven, golden file, fixture, mocking, CLI testing, integration, helper utilities, coverage-driven gap closure), 3 automation tools (coverage analyzer 186x speedup, test generator 200x speedup, methodology guide 7.5x speedup). Validated across 3 project archetypes with 3.1x average speedup, 5.8% adaptation effort, 89% transferability to Python/Rust/TypeScript.
subagent-driven-development
Use when executing implementation plans with independent tasks in the current session
Error Recovery
Comprehensive error handling methodology with 13-category taxonomy, diagnostic workflows, recovery patterns, and prevention guidelines. Use when error rate >5%, MTTD/MTTR too high, errors recurring, need systematic error prevention, or building error handling infrastructure. Provides error taxonomy (file operations, API calls, data validation, resource management, concurrency, configuration, dependency, network, parsing, state management, authentication, timeout, edge cases - 95.4% coverage), 8 diagnostic workflows, 5 recovery patterns, 8 prevention guidelines, 3 automation tools (file path validation, read-before-write check, file size validation - 23.7% error prevention). Validated with 1,336 historical errors, 85-90% transferability across languages/platforms, 0.79 confidence retrospective validation.
grey-haven-test-generation
Comprehensive test suite generation with unit tests, integration tests, edge cases, and error handling. Use when generating tests for existing code, improving coverage, or creating systematic test suites. Triggers: 'generate tests', 'add tests', 'test coverage', 'write tests for', 'create test suite'.
grey-haven-tdd-orchestration
Master TDD orchestration with multi-agent coordination, strict red-green-refactor enforcement, automated test generation, coverage tracking, and >90% coverage quality gates. Coordinates tdd-python, tdd-typescript, and test-generator agents. Use when implementing features with TDD workflow, coordinating multiple TDD agents, enforcing test-first development, or when user mentions 'TDD workflow', 'test-first', 'TDD orchestration', 'multi-agent TDD', 'test coverage', or 'red-green-refactor'.
hook-intercept-block
This skill should be used when implementing slash commands that execute without Claude API calls. Use when: adding a new /bumper-* command, understanding why commands return "block" responses, debugging UserPromptSubmit hooks, or learning the pattern for instant command execution. Keywords: UserPromptSubmit, block decision, hook response, slash command implementation.
Agent Prompt Evolution
Track and optimize agent specialization during methodology development. Use when agent specialization emerges (generic agents show a >5x performance gap), multi-experiment comparison is needed, or methodology transferability analysis is required. Captures agent set evolution (Aₙ tracking), meta-agent evolution (Mₙ tracking), specialization decisions (when and why to create specialized agents), and reusability assessment (universal vs. domain-specific vs. task-specific). Enables systematic cross-experiment learning and optimized M₀ evolution. Adds 2-3 hours of overhead per experiment.
grey-haven-tdd-python
Python Test-Driven Development expertise with pytest, strict red-green-refactor methodology, FastAPI testing patterns, and Pydantic model testing. Use when implementing Python features with TDD, writing pytest tests, testing FastAPI endpoints, developing with test-first approach, or when user mentions 'Python TDD', 'pytest', 'FastAPI testing', 'red-green-refactor', 'Python unit tests', 'test-driven Python', or 'Python test coverage'.
grey-haven-skill-creator
Guide for creating effective skills that extend Claude's capabilities. Use when users want to create a new skill, update an existing skill, or need guidance on skill structure and best practices. Triggers: 'create skill', 'new skill', 'skill template', 'build skill', 'skill structure', 'skill design'.
designprompt
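AI-driven design system builder. Recommends the most suitable design style for a project based on its characteristics (chosen from 30+ professional design systems), or uses a style the user specifies. Automatically applies the full design system specification (colors, fonts, components, motion, and so on) to implement the interface.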
siliconflow-api-skills
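Documentation for the SiliconFlow (硅基流动) cloud service platform. Covers large language model API calls, image generation, embedding models, using SiliconFlow in Claude Code, the Chat Completions API, Stream mode, and more.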
plugin-authoring
Use when creating, modifying, or debugging Claude Code plugins. Triggers on .claude-plugin/, plugin.json, marketplace.json, commands/, agents/, skills/, hooks/ directories. Provides schemas, templates, validation workflows, and troubleshooting.
grey-haven-prompt-engineering
Master 26 documented prompt engineering principles for crafting effective LLM prompts with 400%+ quality improvement. Includes templates, anti-patterns, and quality checklists for technical, learning, creative, and research tasks. Use when writing prompts for LLMs, improving AI response quality, training on prompting, designing agent instructions, or when user mentions 'prompt engineering', 'better prompts', 'LLM quality', 'prompt templates', 'AI prompts', 'prompt principles', or 'prompt optimization'.
grey-haven-evaluation
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
subagent-prompt-construction
Systematic methodology for constructing compact (<150 lines), expressive, Claude Code-integrated subagent prompts using lambda contracts and symbolic logic. Use when creating new specialized subagents for Claude Code with agent composition, MCP tool integration, or skill references. Validated with phase-planner-executor (V_instance=0.895).
grey-haven-tool-design
Design effective MCP tools and Claude Code integrations using the consolidation principle. Fewer, better-designed tools dramatically improve agent success rates. Use when creating MCP servers, designing tool interfaces, optimizing tool sets, or when user mentions 'tool design', 'MCP', 'fewer tools', 'tool consolidation', 'tool architecture', or 'tool optimization'.