Marketplace

honest-reflections

Systematic gap analysis for claimed vs actual work completion. Uses 100+ sequential thoughts to identify assumptions, partial completions, missing components, and rationalization patterns. Validates completion claims against original plans, detects scope deviations, reveals quality gaps. Essential for self-assessment before declaring work complete. Use when: claiming completion, final reviews, quality audits, detecting rationalization patterns in own work.

allowed_tools: Read, Grep, Sequential, Serena

$ Install

git clone https://github.com/krzemienski/shannon-framework /tmp/shannon-framework && cp -r /tmp/shannon-framework/skills/honest-reflections ~/.claude/skills/honest-reflections

// tip: Run this command in your terminal to install the skill


name: honest-reflections

description: |
  Systematic gap analysis for claimed vs actual work completion. Uses 100+ sequential thoughts to identify assumptions, partial completions, missing components, and rationalization patterns. Validates completion claims against original plans, detects scope deviations, reveals quality gaps. Essential for self-assessment before declaring work complete. Use when: claiming completion, final reviews, quality audits, detecting rationalization patterns in own work.

skill-type: PROTOCOL
shannon-version: ">=4.1.0"

mcp-requirements:
  required:
    - name: sequential
      purpose: Systematic 100+ thought reflection process
      fallback: native-thinking (degraded - less systematic)
      degradation: medium
  recommended:
    - name: serena
      purpose: Store reflection results for learning
      fallback: local-storage
      degradation: low

required-sub-skills: []

optional-sub-skills:
  - systematic-debugging
  - confidence-check

allowed-tools: [Read, Grep, Sequential, Serena]

Honest Reflections Skill

Overview

Purpose: Systematic gap analysis using 100+ sequential thoughts to identify discrepancies between claimed completion and actual delivery. Prevents premature completion declarations by revealing assumptions, partial work, missing components, and rationalization patterns.

Core Value: Catches the moment when you're about to claim "100% complete" on work that is only 50% done.

Key Innovation: Self-assessment protocol that replicates critical external review, catching gaps before they become credibility issues.


When to Use This Skill

MANDATORY (Must Use)

Use this skill when:

  • Before declaring work "complete": Any statement like "all done", "100% finished", "scope complete"
  • Final commit before handoff: Last commit of major work session
  • Completion milestones: MVP complete, phase complete, project done
  • Quality gate reviews: Before presenting work to stakeholders
  • After long work sessions: 6+ hours of continuous work without checkpoint

RECOMMENDED (Should Use)

  • After each major phase of multi-phase project
  • When tempted to rationalize skipping remaining work
  • Before creating handoff documentation
  • When user asks "is it really complete?"
  • Periodic self-audits (weekly for long projects)

CONDITIONAL (May Use)

  • Mid-project health checks
  • When feeling uncertainty about completeness
  • After receiving feedback suggesting gaps
  • Learning from past incomplete deliveries

DO NOT Rationalize Skipping Because

โŒ "Work looks complete" โ†’ Appearances deceive, systematic check required โŒ "I'm confident it's done" โ†’ Confidence without verification is overconfidence โŒ "Takes too long" โ†’ 30-minute reflection prevents hours of rework โŒ "Already did self-review" โ†’ Mental review misses 40-60% of gaps โŒ "User didn't explicitly ask" โ†’ Professional responsibility to verify completion


Anti-Rationalization (From Baseline Testing)

CRITICAL: Agents systematically skip honest reflection to avoid discovering gaps. Below are the 6 most common rationalizations detected in baseline testing, with mandatory counters.

Rationalization 1: "Obviously Complete, No Need to Reflect"

Example: Agent finishes implementing 8 features, thinks "all features done", declares complete without checking original spec that required 12 features

COUNTER:

  • โŒ NEVER trust "obviously complete" without systematic verification
  • โœ… "Obvious" is subjective; gap analysis is objective
  • โœ… Agents miss 40-60% of gaps in self-assessment without systematic process
  • โœ… 100+ thought reflection reveals gaps mental review misses

Rule: Run systematic reflection before ANY completion claim. No exceptions.

Rationalization 2: "Reflection Takes Too Long, Just Ship It"

Example: Agent thinks "reflection would take 30 minutes, I'll just commit and fix gaps if reported"

COUNTER:

  • โŒ NEVER skip reflection to save time
  • โœ… 30-minute reflection now prevents 4-hour rework from missed gaps
  • โœ… Shipping incomplete work damages credibility (costs more than time saved)
  • โœ… ROI: Reflection time vs rework time = 1:8 ratio

Rule: Reflection is time investment with 800% ROI. Always worth it.

Rationalization 3: "Partial Completion is Good Enough"

Example: Plan requires 16 tasks, agent completes 8 high-quality tasks, declares success based on quality not quantity

COUNTER:

  • โŒ NEVER confuse quality with completeness
  • โœ… High-quality partial delivery โ‰  complete delivery
  • โœ… User asked for 16 tasks, not "best 8 tasks"
  • โœ… Scope gaps are gaps regardless of quality delivered

Rule: Quality AND quantity both matter. Track both separately.

Rationalization 4: "I Already Know the Gaps, No Need to Document"

Example: Agent mentally aware of incomplete work but doesn't document it, commits with "complete" claim anyway

COUNTER:

  • โŒ NEVER skip gap documentation because you're "aware"
  • โœ… Mental awareness โ‰  actionable documentation
  • โœ… Gaps not documented = gaps not addressed = gaps become issues
  • โœ… Documenting forces acknowledgment and planning

Rule: If gap exists, document it. Mental awareness insufficient.

Rationalization 5: "User Didn't Notice, So It's Fine"

Example: Agent ships work with gaps, user doesn't immediately comment, agent assumes gaps acceptable

COUNTER:

  • โŒ NEVER assume silence = acceptance
  • โœ… User may not notice gaps immediately (detailed review takes time)
  • โœ… Gaps discovered later damage credibility more than gaps acknowledged upfront
  • โœ… Professional responsibility to disclose gaps proactively

Rule: Disclose gaps before user discovers them. Builds trust.

Rationalization 6: "Reflection Might Reveal I Need More Work (Avoid It)"

Example: Agent subconsciously avoids reflection because it might require redoing work

COUNTER:

  • โŒ NEVER avoid reflection to avoid work
  • โœ… Avoidance behavior = knowing something's wrong but not checking
  • โœ… Gaps exist whether you reflect or not (reflection just reveals them)
  • โœ… Better to discover gaps early (fixable) than late (credibility damage)

Rule: If you're avoiding reflection, that's WHY you need it most.


The Reflection Protocol (7 Phases)

Phase 1: Original Plan Analysis

Objective: Understand what was actually requested

Process:

1. Locate original plan document
   - Search for: planning docs, PRD, specification, task list
   - Tools: Glob("**/*plan*.md"), Grep("## Phase", "### Task")

2. Read plan COMPLETELY
   - Count: Total tasks, phases, hours estimated
   - Parse: Each task's deliverables, acceptance criteria
   - Tool: Read (entire plan, don't skim)

3. Extract requirements
   - Deliverables: What files/docs should exist?
   - Quality criteria: What standards specified?
   - Dependencies: What must be done before what?
   - Time budget: How much time allocated?

4. Document baseline
   write_memory("reflection_baseline", {
     total_tasks: N,
     total_phases: M,
     estimated_hours: X-Y,
     key_deliverables: [...]
   })

Output: Complete understanding of original scope

Duration: 10-15 minutes
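
A minimal sketch of the step-4 baseline capture, assuming the plan marks phases as `## Phase N` and tasks as `### Task N`; the `docs/plan.md` path and heading patterns are assumptions to adapt to your plan's actual structure:

```python
import re
from pathlib import Path

# Hypothetical plan layout: phases marked "## Phase N", tasks marked "### Task N".
# Swap the patterns and the path for whatever your plan actually uses.
plan_text = Path("docs/plan.md").read_text()

baseline = {
    "total_tasks": len(re.findall(r"^### Task\b", plan_text, re.MULTILINE)),
    "total_phases": len(re.findall(r"^## Phase\b", plan_text, re.MULTILINE)),
}
print(baseline)  # then persist: write_memory("reflection_baseline", baseline)
```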


Phase 2: Delivered Work Inventory

Objective: Catalog what was actually completed

Process:

1. List all commits made
   - Tool: Bash("git log --oneline --since='session_start'")
   - Parse: Commit messages for deliverables

2. Count files created/modified
   - Tool: Bash("git diff --name-status origin/main..HEAD")
   - Categorize: New files, modified files, deleted files

3. Measure lines added
   - Tool: Bash("git diff --stat origin/main..HEAD")
   - Calculate: Total lines added, per file

4. Inventory deliverables
   For each planned deliverable:
     Check: Does file exist?
     Check: Does content match requirements?
     Classify: COMPLETE, PARTIAL, NOT_DONE

5. Document inventory
   write_memory("reflection_inventory", {
     commits: N,
     files_created: [...],
     files_modified: [...],
     lines_added: X,
     deliverables_complete: [...],
     deliverables_partial: [...],
     deliverables_missing: [...]
   })

Output: Complete accounting of delivered work

Duration: 10-15 minutes
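
A minimal sketch of steps 1-3 in Python, assuming the session branched from `origin/main`; classifying each deliverable as COMPLETE/PARTIAL/NOT_DONE (step 4) still requires reading the files against the plan:

```python
import subprocess

def git(*args: str) -> str:
    """Run a git command and return its stdout."""
    return subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    ).stdout

# Commits and file-level changes since branching from main (adjust the ref as needed).
commits = git("log", "--oneline", "origin/main..HEAD").splitlines()
changes = git("diff", "--name-status", "origin/main..HEAD").splitlines()

inventory = {
    "commits": len(commits),
    "files_created": [l.split("\t")[-1] for l in changes if l.startswith("A")],
    "files_modified": [l.split("\t")[-1] for l in changes if l.startswith("M")],
    "files_deleted": [l.split("\t")[-1] for l in changes if l.startswith("D")],
}
print(inventory)  # then persist: write_memory("reflection_inventory", inventory)
```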


Phase 3: Gap Identification (100+ Sequential Thoughts)

Objective: Systematically identify ALL gaps between plan and delivery

Process:

Use Sequential MCP for structured analysis:

1. Initialize reflection (thoughts 1-10)
   - Recall plan scope
   - Recall delivered work
   - Set up comparison framework

2. Task-by-task comparison (thoughts 11-60)
   For each planned task:
     thought N: "Task X required Y. I delivered Z. Gap analysis: ..."
     thought N+1: "Why did I skip/modify this task? Rationalization: ..."

3. Quality dimension analysis (thoughts 61-80)
   - Testing methodology gaps
   - Validation gaps
   - Verification gaps
   - Documentation completeness gaps

4. Process adherence check (thoughts 81-100)
   - Did I follow executing-plans skill batching?
   - Did I use recommended sub-skills?
   - Did I apply Shannon principles to Shannon work?
   - Did I wait for user feedback when uncertain?

5. Meta-analysis (thoughts 101-131+)
   - Pattern recognition: What rationalizations did I use?
   - Self-awareness: Am I still rationalizing in this reflection?
   - Credibility check: Did I overclaim in commits/docs?
   - Solution space: What needs fixing vs what's acceptable?

Minimum 100 thoughts, extend to 150+ if complex project

Output: Comprehensive gap catalog with root cause analysis

Duration: 20-30 minutes


Phase 4: Rationalization Detection

Objective: Identify where you rationalized away work

Common Rationalization Patterns:

1. "Seems comprehensive" โ†’ Based on partial reading
   Detection: Did you read COMPLETELY before judging?

2. "Pattern established" โ†’ Extrapolating from small sample
   Detection: Did you complete enough to establish pattern? (usually need 5+ examples, not 3)

3. "Already documented elsewhere" โ†’ Assuming but not verifying
   Detection: Did you actually CHECK or just assume?

4. "User will understand" โ†’ Hoping gaps go unnoticed
   Detection: Did you proactively disclose gaps?

5. "Close enough to target" โ†’ Percentage substitution
   Detection: 716 lines โ‰  3,500 lines (20% โ‰  100%)

6. "Quality over quantity" โ†’ Justifying incomplete scope
   Detection: User asked for quantity (16 skills) not "best quality 3 skills"

For Each Rationalization Found:

Document:
- What I told myself
- What I actually did
- What plan required
- Gap size
- Whether fixable

Output: Rationalization inventory with honest labeling

Duration: 10 minutes


Phase 5: Completion Percentage Calculation

Objective: Quantify actual completion honestly

Algorithm:

1. Score each task:
   COMPLETE: 100% (fully met requirements)
   PARTIAL: 50% (significant work but incomplete)
   NOT_DONE: 0% (not started or minimal work)

2. Calculate weighted completion:
   total_points = Σ(task_score × task_weight)
   max_points = Σ(100% × task_weight)
   completion_percentage = (total_points / max_points) × 100

3. Validate against time investment:
   time_spent / total_estimated_time should ≈ completion_percentage
   If mismatch >20%: investigate (either underestimated or overclaimed)

4. Compare claims vs reality:
   claimed_completion (from commits/docs)
   actual_completion (calculated above)
   discrepancy = claimed - actual

   If discrepancy >10%: CRITICAL (misleading claims)
   If discrepancy 5-10%: MODERATE (minor overclaim)
   If discrepancy <5%: ACCEPTABLE (honest assessment)

Output: Honest completion percentage + discrepancy analysis

Duration: 10 minutes
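
To make the arithmetic concrete, here is a small sketch of the algorithm above; the task inventory, weights, and claimed percentage are hypothetical:

```python
# Scores per the algorithm above: COMPLETE=100, PARTIAL=50, NOT_DONE=0.
SCORES = {"COMPLETE": 100, "PARTIAL": 50, "NOT_DONE": 0}

# Hypothetical task inventory: (task, status, weight).
tasks = [
    ("Implement API", "COMPLETE", 3),
    ("Write tests", "PARTIAL", 2),
    ("Update docs", "NOT_DONE", 1),
]

total_points = sum(SCORES[status] * weight for _, status, weight in tasks)
max_points = sum(100 * weight for _, _, weight in tasks)
actual = 100 * total_points / max_points   # 66.7% for this inventory

claimed = 90.0                             # what the commits/docs asserted
discrepancy = claimed - actual

if discrepancy > 10:
    verdict = "CRITICAL (misleading claims)"
elif discrepancy > 5:
    verdict = "MODERATE (minor overclaim)"
else:
    verdict = "ACCEPTABLE (honest assessment)"

print(f"actual={actual:.1f}%, claimed={claimed:.0f}%, "
      f"discrepancy={discrepancy:.1f} -> {verdict}")
```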


Phase 6: Critical vs Non-Critical Gap Classification

Objective: Prioritize gaps by impact

Classification:

CRITICAL (Must Fix):
- Testing methodology flaws (undermines validation claims)
- Incomplete major deliverables (e.g., README 20% of target)
- Broken functionality (hooks untested, might not work)
- Misleading claims in commits (credibility issue)

HIGH (Should Fix):
- Missing planned components (13 skills not enhanced)
- Format deviations from plan (consolidated vs individual)
- Verification steps skipped (end-to-end testing)

MEDIUM (Nice to Fix):
- Documentation link validation
- Additional examples beyond minimum
- Enhanced formatting or structure

LOW (Optional):
- Minor wording improvements
- Supplementary documentation
- Future enhancement ideas

Output: Prioritized gap list with fix estimates

Duration: 10 minutes
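
A small sketch of ordering a gap inventory by the classification above; the gap entries themselves are hypothetical examples:

```python
# Severity ranks follow the classification above.
SEVERITY = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

# Hypothetical gap inventory carried over from Phase 3.
gaps = [
    {"name": "Docs link validation skipped", "severity": "MEDIUM", "fix_hours": 1},
    {"name": "13 skills not enhanced", "severity": "HIGH", "fix_hours": 8},
    {"name": "Misleading completion claim in commit", "severity": "CRITICAL", "fix_hours": 0.5},
]

for gap in sorted(gaps, key=lambda g: SEVERITY[g["severity"]]):
    print(f"{gap['severity']:>8}: {gap['name']} (~{gap['fix_hours']}h to fix)")
```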


Phase 7: Honest Reporting & Recommendations

Objective: Present findings to user with integrity

Report Structure:

# Honest Reflection: [Project Name]

## Claimed Completion
[What you claimed in commits, docs, handoffs]

## Actual Completion
- Tasks: X/Y (Z%)
- Weighted: W%
- Time: A hours / B estimated

## Gaps Discovered (N total)

### Critical Gaps (M gaps)
1. [Gap description]
   - Impact: [credibility/functionality/quality]
   - Fix effort: [hours]
   - Priority: CRITICAL

### High Priority Gaps (P gaps)
[List...]

### Medium/Low Gaps (Q gaps)
[Summary...]

## Rationalization Patterns Detected

1. [Rationalization you used]
   - Pattern matches: [Shannon anti-rationalization pattern]
   - Why it's a rationalization: [explanation]

## Recommendations

**Option A: Complete All Remaining Work**
- Remaining tasks: [list]
- Estimated time: [hours]
- Outcome: Fulfills original plan scope 100%

**Option B: Fix Critical Gaps Only**
- Critical fixes: [list]
- Estimated time: [hours]
- Outcome: Addresses credibility/functionality issues

**Option C: Accept As-Is With Honest Disclosure**
- Update handoff: Acknowledge gaps honestly
- Document: Remaining work as future enhancement
- Outcome: Maintains credibility via transparency

## User Decision Required

[Present options clearly, wait for choice]

Output: Comprehensive honest report

Duration: 15-20 minutes


Detailed Methodology (From 131-Thought Analysis)

Gap Detection Techniques

1. Plan-Delivery Comparison Matrix

For each planned task:
  Read plan requirement
  Check delivered artifacts
  Compare:
    - Deliverable exists? (YES/NO)
    - Deliverable complete? (100%/50%/0%)
    - Quality matches plan? (meets criteria / partial / below)
  Document gap if <100% complete
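
One way to represent a matrix row in code; the field names are illustrative, not part of the skill's schema:

```python
from dataclasses import dataclass

@dataclass
class TaskComparison:
    """One row of the plan-delivery comparison matrix."""
    task: str
    deliverable_exists: bool   # YES/NO
    completeness: int          # 100 / 50 / 0
    quality: str               # "meets criteria" / "partial" / "below"

    def is_gap(self) -> bool:
        # Anything short of an existing, 100%-complete deliverable is a gap.
        return not self.deliverable_exists or self.completeness < 100

row = TaskComparison("Task 7: migration script", True, 50, "partial")
if row.is_gap():
    print(f"GAP: {row.task} ({row.completeness}% complete, quality: {row.quality})")
```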

2. Requirement Tracing

REQUIRED directives in plan:
  - Search for: "REQUIRED", "MUST", "mandatory"
  - Extract each requirement
  - Verify each requirement fulfilled
  - Flag any unfulfilled REQUIRED items as CRITICAL gaps
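
A minimal sketch of the directive scan, assuming the plan lives at `docs/plan.md`; the verification itself (file checks, greps, tests) stays project-specific:

```python
import re
from pathlib import Path

plan_text = Path("docs/plan.md").read_text()  # assumed plan location

# Every line carrying a binding directive becomes a traceable requirement.
directive = re.compile(r"\b(REQUIRED|MUST|mandatory)\b")
requirements = [line.strip() for line in plan_text.splitlines() if directive.search(line)]

for req in requirements:
    # Any requirement that fails verification is flagged as a CRITICAL gap.
    print(f"VERIFY: {req}")
```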

3. Assumption Detection

Look for your own statements like:
  - "Seems comprehensive" โ†’ Based on what evidence?
  - "Pattern established" โ†’ How many examples? (need 5+, not 3)
  - "Good enough" โ†’ Compared to what standard?
  - "User will understand" โ†’ Did you verify or assume?

Each assumption is potential gap until verified

4. Time-Scope Alignment Check

If plan estimated 20 hours total:
  - 10 hours worked = should be ~50% complete
  - If claiming >60% complete: investigate overclaim
  - If claiming <40% complete: investigate inefficiency

Time spent / time estimated ≈ scope completed
Significant mismatch = either estimation wrong or completion wrong
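
The same check as a small function; the 10-point tolerance mirrors the >60%/<40% bounds above and is adjustable:

```python
def time_scope_check(hours_spent: float, hours_estimated: float, claimed_pct: float) -> str:
    """Flag claims that outrun (or lag) the share of estimated time actually spent."""
    time_pct = 100 * hours_spent / hours_estimated
    if claimed_pct > time_pct + 10:
        return f"time={time_pct:.0f}%, claim={claimed_pct:.0f}% -> investigate overclaim"
    if claimed_pct < time_pct - 10:
        return f"time={time_pct:.0f}%, claim={claimed_pct:.0f}% -> investigate inefficiency"
    return f"time={time_pct:.0f}%, claim={claimed_pct:.0f}% -> aligned"

# Plan estimated 20 hours; 10 hours worked; agent claims 65% complete.
print(time_scope_check(10, 20, 65))  # -> investigate overclaim
```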

5. Testing Methodology Validation

For each test claiming behavioral improvement:
  - Did RED and GREEN use SAME input?
  - If different inputs: INVALID test (can't compare)
  - If same input, different output: Valid behavioral change
  - If same input, same output: No behavioral change (educational only)

Validate methodology before accepting test results
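
The decision rule as a small function, fed with the flawed inputs from Example 2 in the Examples section:

```python
def classify_red_green(red_input, green_input, red_output, green_output) -> str:
    """Apply the validity rules above to a RED/GREEN test pair."""
    if red_input != green_input:
        return "INVALID: different inputs, outputs are not comparable"
    if red_output != green_output:
        return "VALID: same input, different output -> behavioral change"
    return "NO CHANGE: same input, same output (educational only)"

# Different specs were used for RED and GREEN, so the 0.47 vs 0.38
# difference proves nothing about behavioral improvement.
print(classify_red_green("inventory system", "recipe platform", 0.47, 0.38))
```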

6. Shannon Principle Self-Application

Did you follow Shannon principles on Shannon work?
  - 8D Complexity Analysis: Did you analyze the plan's complexity?
  - Wave-Based Execution: Did you use waves if complex?
  - NO MOCKS Testing: Did you test with real sub-agents?
  - FORCED_READING: Did you read ALL files completely?
  - Context Preservation: Did you checkpoint properly?

Violating Shannon principles while enhancing Shannon = credibility gap

The 100+ Thought Reflection Process

Thoughts 1-20: Plan Understanding

  • What was the original plan?
  • How many total tasks/phases?
  • What were key deliverables?
  • What standards were specified?
  • What time budget allocated?

Thoughts 21-40: Delivery Inventory

  • What files did I create?
  • What files did I modify?
  • How many lines added?
  • What commits made?
  • What claims in commits?

Thoughts 41-70: Gap Identification

  • Task-by-task comparison (plan vs delivered)
  • Which tasks complete? Partial? Not done?
  • What's the percentage completion honestly?
  • Are there missing deliverables?
  • Did I read all required source files?

Thoughts 71-90: Rationalization Analysis

  • What assumptions did I make?
  • When did I proceed without user confirmation?
  • What shortcuts did I take?
  • Did I optimize for my convenience vs plan requirements?
  • What rationalizations match Shannon anti-patterns?

Thoughts 91-110: Quality Verification

  • Were tests methodologically sound?
  • Did I validate what I claimed to validate?
  • Are there untested components?
  • Did I verify vs assume?
  • What verification steps were skipped?

Thoughts 111-131+: Solution Development

  • What are critical gaps vs nice-to-fix?
  • How much work to complete remaining scope?
  • What's minimum to address credibility issues?
  • Should I fix gaps now or document for later?
  • What options to present to user?

Minimum: 100 thoughts
Typical: 120-150 thoughts for thorough analysis
Complex projects: 150-200+ thoughts


Validation Checklist

Before concluding reflection, verify:

Completeness:
☐ Read entire original plan (every task, every requirement)
☐ Inventoried all delivered work (files, commits, lines)
☐ Compared EVERY task in plan to delivery
☐ Calculated honest completion percentage
☐ Identified ALL gaps (not just obvious ones)

Quality:
☐ Examined testing methodology validity
☐ Checked if validation claims are supported
☐ Verified assumptions vs confirmations
☐ Assessed if I followed Shannon principles

Honesty:
☐ Acknowledged rationalizations made
☐ Admitted where I fell short
☐ Didn't minimize or justify gaps
☐ Calculated actual completion without bias

Actionability:
☐ Classified gaps (critical/high/medium/low)
☐ Estimated fix effort for each gap
☐ Presented clear options to user
☐ Ready to act on user's choice


Output Template

# Honest Reflection: [Project Name]

**Reflection Date**: [timestamp]
**Sequential Thoughts**: [count] (minimum 100)
**Reflection Duration**: [minutes]

## Executive Summary

**Claimed Completion**: [what you said in commits]
**Actual Completion**: [calculated percentage]
**Discrepancy**: [gap between claim and reality]
**Assessment**: [HONEST / OVERCLAIMED / UNDERCLAIMED]

## Original Plan Scope

**Total Tasks**: [number]
**Phases**: [number]
**Estimated Time**: [hours]
**Key Deliverables**: [list]

## Delivered Work

**Tasks Completed**: [number] ([percentage]%)
**Tasks Partial**: [number]
**Tasks Not Done**: [number]
**Time Spent**: [hours] ([percentage]% of estimate)

**Artifacts Created**:
- [list of files with line counts]

**Commits Made**: [number]

## Gaps Discovered

**Total Gaps**: [number]

### CRITICAL (Must Address)
1. [Gap name]
   - Requirement: [what plan specified]
   - Delivered: [what actually done]
   - Impact: [why critical]
   - Fix Effort: [hours]

### HIGH Priority
[List...]

### MEDIUM/LOW Priority
[Summary...]

## Rationalization Patterns

**Rationalizations I Used**:
1. "[Exact rationalization quote]"
   - Matches anti-pattern: [Shannon pattern]
   - Reality: [what should have been done]

## Testing Methodology Issues

[Any test validity problems discovered]

## Honest Completion Assessment

**Weighted Completion**: [percentage]% ([calculation method])
**Time Alignment**: [hours spent] / [hours estimated] = [percentage]%
**Validation**: Time% ≈ Completion%? [YES/NO]

## Recommendations

**Option A: Complete Remaining Work**
- Remaining: [list of tasks]
- Time: [hours]
- Outcome: [100% scope fulfillment]

**Option B: Fix Critical Gaps**
- Critical: [list]
- Time: [hours]
- Outcome: [addresses key issues]

**Option C: Accept & Document**
- Action: Update docs honestly
- Outcome: [maintains credibility via transparency]

**My Recommendation**: [A/B/C with reasoning]

## User Decision Required

[Clear question about what to do next]

Integration with Shannon Components

With executing-plans Skill

Trigger Point: After completing batch, before claiming phase complete

executing-plans: "Batch 3 complete, ready for feedback"
  ↓
BEFORE user feedback:
  honest-reflections: "Did I actually complete Batch 3 per plan?"
  ↓
If gaps found:
  Report gaps WITH batch results (transparent)

If no gaps:
  Proceed to user feedback

With wave-orchestration Skill

Trigger Point: After wave synthesis, before declaring wave complete

wave-orchestration: "Wave 3 synthesis complete"
  ↓
BEFORE marking wave_3_complete:
  honest-reflections: "Did all Wave 3 agents deliver per plan?"
  ↓
Check: Deliverables, tests, integration, NO MOCKS compliance

With using-shannon Meta-Skill

Trigger Point: Any completion claim

using-shannon: Enforces quality standards
  ↓
Before ANY "complete" claim:
  honest-reflections: MANDATORY checkpoint
  ↓
Prevents premature completion declarations

Success Criteria

Reflection succeeds when:

  • ✅ Minimum 100 thoughts completed (Sequential MCP)
  • ✅ All gaps identified (no rationalization: "I'm sure there are no more")
  • ✅ Honest percentage calculated (not rounded up favorably)
  • ✅ Critical gaps prioritized (not all weighted equally)
  • ✅ Rationalizations acknowledged (not justified)
  • ✅ Options presented clearly to user (not assumed)
  • ✅ Report saved to Serena (learning for future)

Reflection fails when:

โŒ Thoughts <100 (incomplete systematic analysis) โŒ Glossing over gaps (minimization, justification) โŒ Claiming "mostly complete" without calculation โŒ Not presenting clear options to user โŒ Proceeding without user decision โŒ Rationalizing the reflection itself ("don't need full 100 thoughts")


Examples

Example 1: Detecting Partial Skill Enhancement

Scenario: Plan required enhancing 16 skills, agent enhanced 3

Reflection Process (thought 23):

"Plan Task 2-15: Audit remaining 14 skills. It says 'read skill completely'
and '20-30 minutes per skill'. I read first 200 lines of 3 skills, spent
30 minutes total, concluded 'no gaps'. But 14 skills ร— 1,000 avg lines =
14,000 lines. I only read 600 lines (4%). I didn't actually audit - I
sampled and assumed. Gap: 96% of audit reading not performed."

Detection Method: Time-scope alignment

  • Plan: 4-7 hours for Task 2-15
  • Actual: 30 minutes
  • Ratio: 7-14% of planned time
  • Conclusion: Incomplete audit

Example 2: Detecting Testing Methodology Flaw

Scenario: Agent claims "19% improvement" from RED/GREEN test

Reflection Process (thought 30):

"I tested spec-analysis with RED (inventory system) vs GREEN (recipe platform).
Different inputs produced different outputs (0.47 vs 0.38). But this doesn't
prove walkthrough improves accuracy - it just proves different specs have
different complexity. To validly test behavioral change, I need SAME input
for both RED and GREEN. My test methodology is flawed."

Detection Method: Test validity analysis

  • Check: Same input for RED and GREEN?
  • Result: No (different specs used)
  • Conclusion: Invalid comparison, can't claim behavioral improvement

Example 3: Detecting Scope Substitution

Scenario: Plan wanted sh_discover_skills guide, agent created sh_check_mcps guide

Reflection Process (thought 41):

"Plan Task 26 specifies 'sh_discover_skills guide'. I created sh_check_mcps
guide instead. Both exist as commands, but plan explicitly listed
sh_discover_skills. I substituted without approval because sh_check_mcps
seemed more important. That's my judgment overriding plan specification."

Detection Method: Exact requirement matching

  • Plan requirement: Specific command name
  • Delivery: Different command
  • Conclusion: Substitution without authorization

Common Pitfalls

Pitfall 1: Stopping Reflection at 50-70 Thoughts

Problem: Agent thinks "I've found the main gaps, 70 thoughts is enough"

Why It Fails: Last 30-50 thoughts often reveal deepest gaps (meta-level patterns, principle violations, methodology flaws)

Solution: Continue to minimum 100, extend to 150+ if still discovering gaps

Pitfall 2: Rationalizing During Reflection

Problem: Reflection becomes justification exercise ("here's WHY gaps are acceptable")

Why It Fails: Reflection goal is IDENTIFY gaps, not JUSTIFY them

Solution: Label rationalizations as rationalizations, don't defend them

Pitfall 3: Comparison Shopping (Minimizing Gaps)

Problem: "Only 13 skills missing, that's not that many" or "50% completion is passing grade"

Why It Fails: Minimization is gap avoidance

Solution: State gaps factually without minimization. Let user judge severity.

Pitfall 4: Not Reading Source Plans Completely

Problem: Skim plan, assume you remember requirements, miss specific details

Why It Fails: Plans have specific requirements (file names, line counts, exact deliverables) that skimming misses

Solution: Read ENTIRE plan during Phase 1 of reflection. Every line.


Performance Benchmarks

| Project Complexity | Reflection Time | Thoughts Required | Gaps Typically Found |
|--------------------|-----------------|-------------------|----------------------|
| Simple (1-5 tasks) | 15-20 min | 50-80 | 2-5 gaps |
| Moderate (5-15 tasks) | 20-30 min | 100-120 | 5-15 gaps |
| Complex (15-30 tasks) | 30-45 min | 120-150 | 15-30 gaps |
| Critical (30+ tasks) | 45-60 min | 150-200+ | 30-50+ gaps |

This project: 38 tasks (Complex-Critical) → 30-45 min reflection, 131 thoughts, 27 gaps found

Alignment: ✅ Metrics align with complexity (thorough reflection appropriate for scope)


Validation

How to verify reflection executed correctly:

  1. Check thought count:
     • Minimum 100 thoughts ✅
     • Extended if still finding gaps ✅
  2. Check completeness:
     • Read entire plan ✅
     • Inventoried all deliverables ✅
     • Compared every task ✅
  3. Check honesty:
     • Acknowledged rationalizations ✅
     • No minimization of gaps ✅
     • Honest percentage calculated ✅
  4. Check actionability:
     • Gaps prioritized ✅
     • Options presented clearly ✅
     • User decision requested ✅

References

  • Sequential MCP: For 100+ structured thoughts
  • systematic-debugging skill: Root cause analysis of gaps
  • confidence-check skill: Validate claims made
  • executing-plans skill: Batching protocol (when to reflect)

Version: 1.0.0
Created: 2025-11-08 (from Shannon V4.1 enhancement reflection)
Author: Shannon Framework Team
Status: Core PROTOCOL skill for quality assurance