skill-validator

Validates skills against production-level criteria with 9-category scoring. This skill should be used when reviewing, auditing, or improving skills to ensure quality standards. Evaluates structure, content, user interaction, documentation, domain standards, technical robustness, maintainability, zero-shot implementation, and reusability. Returns an actionable validation report with scores and improvement recommendations.

$ Install

git clone https://github.com/panaversity/claude-code-skills-lab /tmp/claude-code-skills-lab && cp -r /tmp/claude-code-skills-lab/.claude/skills/skill-validator ~/.claude/skills/skill-validator

// tip: Run this command in your terminal to install the skill



Skill Validator

Validate any skill against production-level quality criteria.

Validation Workflow

Phase 1: Gather Context

  1. Read the skill's SKILL.md completely
  2. Identify skill type from frontmatter description:
    • Builder skill (creates artifacts)
    • Guide skill (provides instructions)
    • Automation skill (executes workflows)
    • Analyzer skill (extracts insights)
    • Validator skill (enforces quality)
    • Hybrid skill (combination of above)
  3. Read all reference files in references/ directory
  4. Check for assets/scripts directories
  5. Note frontmatter fields (name, description, allowed-tools, model)
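The context-gathering steps above can be sketched in Python. This is a minimal illustration, not part of the skill itself: `parse_frontmatter` and `gather_context` are hypothetical helper names, and the parser assumes flat `key: value` frontmatter delimited by `---` lines (a real YAML parser would be more robust).

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a flat `key: value` frontmatter block delimited by '---' lines.

    Assumption: no nested YAML. Block scalars ('|' or '>') are joined
    into a single space-separated string.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line[:1] not in (" ", "\t") and ":" in line:
            # Unindented "key: value" line starts a new field.
            key, _, value = line.partition(":")
            key = key.strip()
            value = value.strip()
            fields[key] = "" if value in ("|", ">") else value
        elif key is not None and line.strip():
            # Indented continuation line: append to the current field.
            fields[key] = (fields[key] + " " + line.strip()).strip()
    return fields


def gather_context(skill_dir: Path) -> dict:
    """Collect the Phase 1 inputs for one skill directory."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    refs_dir = skill_dir / "references"
    references = sorted(p.name for p in refs_dir.glob("*.md")) if refs_dir.is_dir() else []
    return {
        "frontmatter": parse_frontmatter(text),
        "line_count": len(text.splitlines()),
        "references": references,
        "has_assets": (skill_dir / "assets").is_dir(),
        "has_scripts": (skill_dir / "scripts").is_dir(),
    }
```

Identifying the skill type (step 2) is left out on purpose: it depends on reading the description, which a simple script cannot do reliably.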

Phase 2: Apply Criteria

Evaluate against 9 criteria categories. Each criterion scores 0-3:

  • 0: Missing/Absent
  • 1: Present but inadequate
  • 2: Adequate implementation
  • 3: Excellent implementation

Criteria Categories

1. Structure & Anatomy (Weight: 12%)

| Criterion | What to Check |
|-----------|---------------|
| SKILL.md exists | Root file present |
| Line count | <500 lines (context is precious) |
| Frontmatter complete | name and description present in YAML |
| Name constraints | Lowercase, numbers, hyphens only; ≤64 chars; matches directory |
| Description format | [What] + [When] format; ≤1024 chars |
| Description style | Third-person: "This skill should be used when..." |
| No extraneous files | No README.md, CHANGELOG.md, LICENSE in skill dir |
| Progressive disclosure | Details in references/, not bloated SKILL.md |
| Asset organization | Templates in assets/, scripts in scripts/ |
| Large file guidance | If references >10k words, grep patterns in SKILL.md |

Fail condition: Missing SKILL.md or >800 lines = automatic fail
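Several of the structure checks are mechanical and could be automated. The sketch below is one possible reading of the criteria, not the skill's own implementation: `check_structure` and its parameters are hypothetical, and the regular expression is an assumed interpretation of "lowercase, numbers, hyphens only".

```python
import re

# Assumed interpretation of the name constraint: a-z, 0-9, and hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")
EXTRANEOUS = {"README.md", "CHANGELOG.md", "LICENSE"}


def check_structure(name, description, line_count, directory_name,
                    files=(), skill_md_exists=True):
    """Return structure issues plus the automatic-fail flag."""
    issues = []
    if not NAME_PATTERN.fullmatch(name):
        issues.append("name must use lowercase letters, numbers, and hyphens only")
    if len(name) > 64:
        issues.append("name exceeds 64 characters")
    if name != directory_name:
        issues.append("name does not match directory")
    if len(description) > 1024:
        issues.append("description exceeds 1024 characters")
    if line_count >= 500:
        issues.append("SKILL.md is not under 500 lines")
    extra = sorted(EXTRANEOUS.intersection(files))
    if extra:
        issues.append("extraneous files: " + ", ".join(extra))
    # Fail condition: missing SKILL.md or more than 800 lines.
    automatic_fail = (not skill_md_exists) or line_count > 800
    return {"issues": issues, "automatic_fail": automatic_fail}
```

A file between 500 and 800 lines is reported as an issue but does not trigger the automatic fail, which matches the two separate thresholds stated above.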

2. Content Quality (Weight: 15%)

| Criterion | What to Check |
|-----------|---------------|
| Conciseness | No verbose explanations; context is a public good |
| Imperative form | Instructions use "Do X" not "You should do X" |
| Appropriate freedom | Constraints where needed, flexibility where safe |
| Scope clarity | Clear what skill does AND does not do |
| No hallucination risk | No instructions that encourage making up info |
| Output specification | Clear expected outputs defined |

3. User Interaction (Weight: 12%)

| Criterion | What to Check |
|-----------|---------------|
| Clarification triggers | Asks questions before acting on ambiguity |
| Required vs optional | Distinguishes must-know from nice-to-know |
| Graceful handling | What to do when user doesn't answer |
| No over-asking | Doesn't ask obvious or inferrable questions |
| Question pacing | Avoids too many questions in single message |
| Context awareness | Uses available context before asking |

Key pattern to look for:

## Required Clarifications
1. Question about X
2. Question about Y

## Optional Clarifications
3. Question about Z (if relevant)

Note: Avoid asking too many questions in a single message.

4. Documentation & References (Weight: 10%)

| Criterion | What to Check |
|-----------|---------------|
| Source URLs | Official documentation links provided |
| Reference files | Complex details in references/ not main file |
| Fetch guidance | Instructions to fetch docs for unlisted patterns |
| Version awareness | Notes about checking for latest patterns |
| Example coverage | Good/bad examples for key patterns |

Key pattern to look for:

| Resource | URL | Use For |
|----------|-----|---------|
| Official Docs | https://... | Complex cases |

5. Domain Standards (Weight: 10%)

| Criterion | What to Check |
|-----------|---------------|
| Best practices | Follows domain conventions (e.g., WCAG, OWASP) |
| Enforcement mechanism | Checklists, validation steps, must-verify items |
| Anti-patterns | Lists what NOT to do |
| Quality gates | Output checklist before delivery |

Key pattern to look for:

### Must Follow
- [ ] Requirement 1
- [ ] Requirement 2

### Must Avoid
- Antipattern 1
- Antipattern 2

6. Technical Robustness (Weight: 8%)

| Criterion | What to Check |
|-----------|---------------|
| Error handling | Guidance for failure scenarios |
| Security considerations | Input validation, secrets handling if relevant |
| Dependencies | External tools/APIs documented |
| Edge cases | Common edge cases addressed |
| Testability | Can outputs be verified? |

7. Maintainability (Weight: 8%)

| Criterion | What to Check |
|-----------|---------------|
| Modularity | References are self-contained topics |
| Update path | Easy to update when standards change |
| No hardcoded values | Uses placeholders/variables where appropriate |
| Clear organization | Logical section ordering |

8. Zero-Shot Implementation (Weight: 12%)

Skills should enable single-interaction implementation with embedded expertise.

| Criterion | What to Check |
|-----------|---------------|
| Before Implementation section | Context gathering guidance present |
| Codebase context | Guidance to scan existing structure/patterns |
| Conversation context | Uses discussed requirements/decisions |
| Embedded expertise | Domain knowledge in references/, not runtime discovery |
| User-only questions | Only asks for USER requirements, not domain knowledge |

Key pattern to look for:

## Before Implementation

Gather context to ensure successful implementation:

| Source | Gather |
|--------|--------|
| **Codebase** | Existing structure, patterns, conventions |
| **Conversation** | User's specific requirements |
| **Skill References** | Domain patterns from `references/` |
| **User Guidelines** | Project-specific conventions |

Red flag: Skill instructs to "research" or "discover" domain knowledge at runtime instead of embedding it.

9. Reusability (Weight: 13%)

Skills should handle variations, not single requirements.

| Criterion | What to Check |
|-----------|---------------|
| Handles variations | Not hardcoded to single use case |
| Variable elements | Clarifications capture what VARIES |
| Constant patterns | Domain best practices encoded as constants |
| Not requirement-specific | Avoids hardcoded data, tools, configs |
| Abstraction level | Appropriate generalization for domain |

Good example:

"Create visualizations - adaptable to data shape, chart type, library"

Bad example (too specific):

"Create bar chart with sales data using Recharts"

Key check: Does the skill work for multiple use cases within its domain?


Type-Specific Validation

After scoring general criteria, verify type-specific requirements:

| Type | Must Have |
|------|-----------|
| Builder | Clarifications, Output Spec, Domain Standards, Output Checklist |
| Guide | Workflow Steps, Examples (Good/Bad), Official Docs links |
| Automation | Scripts in scripts/, Dependencies, Error Handling, I/O Spec |
| Analyzer | Analysis Scope, Evaluation Criteria, Output Format, Synthesis |
| Validator | Quality Criteria, Scoring Rubric, Thresholds, Remediation |

Scoring: Deduct 10 points from the overall score if the type-specific requirements for the identified type are missing.
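A minimal sketch of the type-specific check, under two stated assumptions: the deduction is a flat 10 points however many items are missing, and a type without a row in the table (such as Hybrid) carries no deduction. `type_specific_deduction` is a hypothetical helper; deciding which items are actually present in a skill remains a judgment call for the reviewer.

```python
# Must-have items per skill type, copied from the table above.
TYPE_REQUIREMENTS = {
    "builder": ["Clarifications", "Output Spec", "Domain Standards", "Output Checklist"],
    "guide": ["Workflow Steps", "Examples (Good/Bad)", "Official Docs links"],
    "automation": ["Scripts in scripts/", "Dependencies", "Error Handling", "I/O Spec"],
    "analyzer": ["Analysis Scope", "Evaluation Criteria", "Output Format", "Synthesis"],
    "validator": ["Quality Criteria", "Scoring Rubric", "Thresholds", "Remediation"],
}


def type_specific_deduction(skill_type: str, present: set[str]) -> tuple[list[str], int]:
    """Return the missing must-have items and the resulting deduction."""
    # Assumption: unknown or hybrid types have no requirements here.
    required = TYPE_REQUIREMENTS.get(skill_type.lower(), [])
    missing = [item for item in required if item not in present]
    # Assumption: a flat 10-point deduction, not 10 points per missing item.
    return missing, (10 if missing else 0)
```

For example, a Validator skill that documents criteria, a rubric, and thresholds but no remediation guidance would return `(["Remediation"], 10)`.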


Scoring Guide

Category Scores

Calculate each category score:

Category Score = (Sum of criterion scores) / (Max possible) * 100

Overall Score

Overall = Σ(Category Score × Weight)
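The two formulas can be worked through in a short Python sketch. The weights come from the category headings above and sum to 100%; the function names are illustrative, and the weight is applied as a fraction (12% becomes 0.12) so that the overall score stays on a 0-100 scale.

```python
# Category weights in percent, taken from the category headings.
WEIGHTS = {
    "Structure & Anatomy": 12,
    "Content Quality": 15,
    "User Interaction": 12,
    "Documentation & References": 10,
    "Domain Standards": 10,
    "Technical Robustness": 8,
    "Maintainability": 8,
    "Zero-Shot Implementation": 12,
    "Reusability": 13,
}


def category_score(criterion_scores: list[int]) -> float:
    """(Sum of criterion scores) / (Max possible) * 100, with 3 points max per criterion."""
    if not criterion_scores:
        raise ValueError("a category needs at least one criterion score")
    max_possible = 3 * len(criterion_scores)
    return sum(criterion_scores) / max_possible * 100


def overall_score(category_scores: dict[str, float]) -> float:
    """Weighted sum of category scores, with each weight applied as a fraction."""
    return sum(category_scores[name] * weight / 100 for name, weight in WEIGHTS.items())
```

A category with criterion scores of 2, 1, 3, and 0 earns 6 of 12 possible points, so its category score is 50. If that category is Content Quality, it contributes 50 × 0.15 = 7.5 points to the overall score.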

Rating Thresholds

| Score | Rating | Meaning |
|-------|--------|---------|
| 90-100 | Production | Ready for wide use |
| 75-89 | Good | Minor improvements needed |
| 60-74 | Adequate | Functional but needs work |
| 40-59 | Developing | Significant gaps |
| 0-39 | Incomplete | Major rework required |
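Mapping a score to a rating is a simple threshold lookup. The sketch below assumes each band's lower bound is inclusive, so a fractional score such as 89.5 falls in "Good". The table lists integer ranges only, so that boundary handling is an assumption, as is the function name `rating_for`.

```python
# (lower bound, rating) pairs in descending order, from the thresholds table.
RATING_BANDS = [
    (90, "Production"),
    (75, "Good"),
    (60, "Adequate"),
    (40, "Developing"),
    (0, "Incomplete"),
]


def rating_for(score: float) -> str:
    """Return the rating whose lower bound the score meets or exceeds."""
    for lower_bound, rating in RATING_BANDS:
        if score >= lower_bound:
            return rating
    # Scores below zero (e.g., after a deduction) are treated as Incomplete.
    return "Incomplete"
```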

Output Format

Generate validation report:

# Skill Validation Report: [skill-name]

**Rating**: [Production/Good/Adequate/Developing/Incomplete]
**Overall Score**: [X]/100

## Summary
[2-3 sentence assessment]

## Category Scores

| Category | Score | Weight | Weighted |
|----------|-------|--------|----------|
| Structure & Anatomy | X/100 | 12% | X |
| Content Quality | X/100 | 15% | X |
| User Interaction | X/100 | 12% | X |
| Documentation | X/100 | 10% | X |
| Domain Standards | X/100 | 10% | X |
| Technical Robustness | X/100 | 8% | X |
| Maintainability | X/100 | 8% | X |
| Zero-Shot Implementation | X/100 | 12% | X |
| Reusability | X/100 | 13% | X |
| **Type-Specific Deduction** | -X | - | -X |

## Critical Issues (if any)
- [Issue requiring immediate fix]

## Improvement Recommendations
1. **High Priority**: [Specific action]
2. **Medium Priority**: [Specific action]
3. **Low Priority**: [Specific action]

## Strengths
- [What skill does well]
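If the report is generated programmatically, the Category Scores table could be filled in along these lines. This is an illustrative sketch only: `render_category_table` is a hypothetical helper, and the rounding (whole-number scores, one decimal for weighted values) is an assumed presentation choice.

```python
def render_category_table(scores: dict[str, float], weights: dict[str, int]) -> str:
    """Render the Category Scores table rows in the report's markdown format."""
    lines = [
        "| Category | Score | Weight | Weighted |",
        "|----------|-------|--------|----------|",
    ]
    for name, weight in weights.items():
        score = scores[name]
        weighted = score * weight / 100  # weight is a percentage
        lines.append(f"| {name} | {score:.0f}/100 | {weight}% | {weighted:.1f} |")
    return "\n".join(lines)
```

The Type-Specific Deduction row is omitted here and would be appended separately when a deduction applies.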

Quick Validation Checklist

For rapid assessment, check these critical items:

Structure & Frontmatter

  • SKILL.md <500 lines
  • Frontmatter: name (≤64 chars, lowercase, hyphens) + description (≤1024 chars)
  • Description uses third-person style ("This skill should be used when...")
  • No README.md/CHANGELOG.md in skill directory

Content & Interaction

  • Has clarification questions (Required vs Optional)
  • Has output specification
  • Has official documentation links

Zero-Shot & Reusability

  • Has "Before Implementation" section (context gathering)
  • Domain expertise embedded in references/ (not runtime discovery)
  • Handles variations (not requirement-specific)

Type-Specific (check based on skill type)

  • Builder: Clarifications + Output Spec + Standards + Checklist
  • Guide: Workflow + Examples + Docs
  • Automation: Scripts + Dependencies + Error Handling
  • Analyzer: Scope + Criteria + Output Format
  • Validator: Criteria + Scoring + Thresholds + Remediation

  • If 10+ checked: Likely Production (90+)
  • If 7-9 checked: Likely Good (75-89)
  • If 5-6 checked: Likely Adequate (60-74)
  • If <5 checked: Needs significant work


Reference Files

| File | When to Read |
|------|--------------|
| references/detailed-criteria.md | Deep evaluation of specific criterion |
| references/scoring-examples.md | Example validations for calibration |
| references/improvement-patterns.md | Common fixes for common issues |

Usage Examples

Validate a skill

Validate the chatgpt-widget-creator skill against production criteria

Quick audit

Quick validation check on mcp-builder skill

Focused review

Check if skill-creator skill has proper user interaction patterns