context-optimizer
Second-pass context optimization that analyzes user prompts and removes irrelevant specs, agents, and skills from loaded context. Achieves 80%+ token reduction through smart cleanup. Activates for optimize context, reduce tokens, clean context, smart context, precision loading.
Installer
git clone https://github.com/anton-abyzov/specweave /tmp/specweave && cp -r /tmp/specweave/plugins/specweave/skills/context-optimizer ~/.claude/skills/specweave/
Tip: Run this command in your terminal to install the skill.
name: context-optimizer
description: Second-pass context optimization that analyzes user prompts and removes irrelevant specs, agents, and skills from loaded context. Achieves 80%+ token reduction through smart cleanup. Activates for optimize context, reduce tokens, clean context, smart context, precision loading.
allowed-tools: Read, Grep, Glob
Context Optimizer
Second-pass context optimization that analyzes user intent and surgically removes irrelevant content from loaded context, achieving 80%+ total token reduction.
Purpose
After context-loader loads context based on the manifest (roughly a 70% reduction), context-optimizer analyzes the user's specific prompt and removes sections that aren't needed for that particular task.
The Two-Pass Strategy
Pass 1: Context Loader (Manifest-Based)
# context-manifest.yaml
spec_sections:
- auth-spec.md
- payment-spec.md
- user-management-spec.md
Result: Load only relevant specs (70% reduction)
Before: 150k tokens → After: 45k tokens
Pass 2: Context Optimizer (Intent-Based)
User: "Fix authentication bug in login endpoint"
Analyzer detects:
• Task type: Bug fix (not new feature)
• Domain: Backend auth
• Scope: Single endpoint
Removes:
❌ payment-spec.md (different domain)
❌ user-management-spec.md (different domain)
❌ PM agent description (not needed for bug fix)
❌ Frontend skills (backend task)
❌ DevOps skills (not deploying)
Keeps:
✅ auth-spec.md (directly relevant)
✅ architecture/security/ (auth considerations)
✅ nodejs-backend skill (implementation)
✅ Tech Lead agent (code review)
Result: Additional 40% reduction
After Pass 1: 45k tokens → After Pass 2: 27k tokens
Total reduction: 82% (150k → 27k)
When to Use
Activates automatically after context-loader when:
- User prompt is specific (mentions feature, bug, file)
- Loaded context > 20k tokens
- Task is focused (not "build full product")
Manual activation:
- "optimize context"
- "reduce tokens"
- "clean context"
Skip when:
- Context already small (<10k tokens)
- User asks broad questions ("explain architecture")
- Planning new features (need full context)
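Taken together, these criteria boil down to a small activation check. The sketch below is illustrative only: the `shouldOptimize` name and the threshold values are assumptions, not part of the skill's actual interface.

```typescript
// Hedged sketch of the activation decision (names and thresholds are assumptions).
const BROAD_HINTS = ["architecture", "explain", "overview", "full product", "plan"];

function shouldOptimize(prompt: string, loadedTokens: number): boolean {
  if (loadedTokens < 10_000) return false; // context already small, nothing to gain
  const lower = prompt.toLowerCase();
  if (BROAD_HINTS.some((hint) => lower.includes(hint))) {
    return false; // broad questions and planning keep full context
  }
  return loadedTokens > 20_000; // worth a second pass only above ~20k tokens
}
```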
What It Does
1. User Intent Analysis
interface IntentAnalysis {
  task_type: TaskType;
  domains: Domain[];
  scope: Scope;
  needs_full_context: boolean;
  confidence: number;
}

enum TaskType {
  BUG_FIX = "bug-fix",             // Narrow scope
  FEATURE = "feature",             // Medium scope
  REFACTOR = "refactor",           // Medium scope
  ARCHITECTURE = "architecture",   // Broad scope
  DOCUMENTATION = "documentation", // Medium scope
  TESTING = "testing"              // Medium scope
}

enum Domain {
  FRONTEND = "frontend",
  BACKEND = "backend",
  DATABASE = "database",
  INFRASTRUCTURE = "infrastructure",
  SECURITY = "security",
  AUTH = "auth",
  PAYMENT = "payment",
  // ... project-specific domains
}

enum Scope {
  NARROW = "narrow",   // Single file/function
  FOCUSED = "focused", // Single module
  BROAD = "broad"      // Multiple modules
}
Analysis Examples:
| User Prompt | Task Type | Domains | Scope | Needs Full? |
|---|---|---|---|---|
| "Fix login bug" | BUG_FIX | [AUTH, BACKEND] | NARROW | No |
| "Add payment feature" | FEATURE | [PAYMENT, BACKEND] | FOCUSED | No |
| "Refactor auth module" | REFACTOR | [AUTH, BACKEND] | FOCUSED | No |
| "Design system architecture" | ARCHITECTURE | [ALL] | BROAD | Yes |
| "Explain how payments work" | DOCUMENTATION | [PAYMENT] | FOCUSED | No |
2. Context Filtering Rules
rules:
  # Rule 1: Task-Specific Specs
  bug_fix:
    keep_specs:
      - Related to mentioned domain
      - Architecture docs for that domain
    remove_specs:
      - Unrelated domains
      - Strategic docs (PRD, business specs)
      - Future roadmap

  feature_development:
    keep_specs:
      - Related domain specs
      - Architecture for integration points
      - Related ADRs
    remove_specs:
      - Unrelated domains
      - Completed features (unless mentioned)

  architecture_review:
    keep_specs:
      - ALL (needs full context)

  # Rule 2: Agent/Skill Filtering
  backend_task:
    keep_skills:
      - Backend skills (nodejs, python, dotnet)
      - Tech Lead
      - QA Lead
    remove_skills:
      - Frontend skills
      - DevOps (unless "deploy" mentioned)
      - PM agent (unless "requirements" mentioned)

  frontend_task:
    keep_skills:
      - Frontend skills (React, Next.js)
      - UI/UX skills
    remove_skills:
      - Backend skills
      - Database skills

  # Rule 3: Documentation Filtering
  implementation_task:
    keep_docs:
      - Technical specs (HLD, LLD)
      - ADRs
      - Implementation guides
    remove_docs:
      - Strategic docs (PRD, business cases)
      - Operations runbooks
      - Deployment guides

  planning_task:
    keep_docs:
      - Strategic docs (PRD)
      - Architecture overview
      - ADRs
    remove_docs:
      - Implementation details
      - Code comments
      - Test cases
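These rules map naturally onto small filter functions. Below is a hedged sketch of the domain filter used in the algorithm that follows; the `Spec` shape with a `domains` tag list is an assumption for illustration.

```typescript
// Hypothetical shape for a loaded spec entry; the real context entries may differ.
interface Spec {
  path: string;
  domains: Domain[];
  tokens: number;
}

// Keep specs whose domain tags overlap the detected domains; drop the rest.
// Architecture docs for a kept domain match through their own domain tags.
function filterByDomain(specs: Spec[], wanted: Domain[]): Spec[] {
  if (wanted.length === 0) return specs; // no domain signal: play safe, keep everything
  return specs.filter((spec) => spec.domains.some((d) => wanted.includes(d)));
}
```

`filterByTaskType` and `filterByScope`, referenced in the algorithm below, would follow the same pattern against the task-type and scope rules.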
3. Optimization Algorithm
async function optimizeContext(
  userPrompt: string,
  loadedContext: Context
): Promise<OptimizedContext> {
  // Step 1: Analyze intent
  const intent = await analyzeIntent(userPrompt);

  // Step 2: If broad scope, keep all
  if (intent.needs_full_context) {
    return {
      context: loadedContext,
      removed: [],
      kept: Object.keys(loadedContext),
      reason: "Broad scope requires full context"
    };
  }

  // Step 3: Apply filtering rules
  const filtered = {
    specs: filterByDomain(loadedContext.specs, intent.domains),
    agents: filterByTaskType(loadedContext.agents, intent.task_type),
    skills: filterByDomain(loadedContext.skills, intent.domains),
    docs: filterByScope(loadedContext.docs, intent.scope)
  };

  // Step 4: Calculate savings
  const before = calculateTokens(loadedContext);
  const after = calculateTokens(filtered);
  const savings = ((before - after) / before * 100).toFixed(0);

  // Step 5: Return optimized context
  return {
    context: filtered,
    removed: diff(loadedContext, filtered),
    kept: Object.keys(filtered),
    savings: `${savings}%`,
    tokens_before: before,
    tokens_after: after
  };
}
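`calculateTokens` and the `Context` shape are not defined by the skill itself; the hedged sketch below assumes each loaded entry records an estimated token count (a common rough heuristic is about four characters per token).

```typescript
// Assumed shapes, consistent with the Spec sketch above; the real skill may differ.
interface ContextEntry {
  path: string;
  content: string;
  tokens: number; // e.g. Math.ceil(content.length / 4), ~4 characters per token
}
type Context = Record<string, ContextEntry[]>; // keys: specs, agents, skills, docs

// Sum the estimated token counts across every section of the loaded context.
function calculateTokens(context: Context): number {
  return Object.values(context)
    .flat()
    .reduce((sum, entry) => sum + entry.tokens, 0);
}

// Example (illustrative):
// const result = await optimizeContext("Fix login bug with expired JWT", loadedContext);
// console.log(result.savings, `${result.tokens_before} -> ${result.tokens_after}`);
```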
Usage Examples
Example 1: Bug Fix (Narrow Scope)
# Pass 1: context-loader loads from manifest
Loaded context: 45k tokens (auth, payment, user specs)
# User prompt
User: "Fix bug where login fails with expired JWT"
# Pass 2: context-optimizer analyzes
🔍 Analyzing task requirements...
Detected:
Task Type: Bug Fix
Domains: [AUTH, BACKEND]
Scope: Narrow (single endpoint)
Full Context Needed: No
Optimizing context...
Removed (18k tokens):
❌ payment-spec.md (9k tokens) - Unrelated domain
❌ user-management-spec.md (7k tokens) - Unrelated domain
❌ PM agent description (1k tokens) - Not needed for bug fix
❌ Frontend skills (1k tokens) - Backend task
Kept (27k tokens):
✅ auth-spec.md (12k tokens) - Core domain
✅ architecture/security/jwt-handling.md (5k tokens) - Relevant
✅ nodejs-backend skill (8k tokens) - Implementation
✅ Tech Lead agent (2k tokens) - Code review
Result: 45k → 27k tokens (40% additional reduction)
Total: 150k → 27k (82% total reduction)
Ready to proceed with optimized context.
Example 2: Feature Development (Focused Scope)
User: "Add subscription billing to payment module"
🔍 Analyzing task requirements...
Detected:
Task Type: Feature Development
Domains: [PAYMENT, BACKEND]
Scope: Focused (single module)
Full Context Needed: No
Optimizing context...
Removed (15k tokens):
❌ auth-spec.md (6k tokens) - Unrelated domain
❌ user-management-spec.md (7k tokens) - Unrelated
❌ DevOps agent (2k tokens) - Not deploying yet
Kept (30k tokens):
✅ payment-spec.md (9k tokens) - Core domain
✅ architecture/payment-integration.md (6k tokens) - Integration points
✅ architecture/adr/0015-payment-provider.md (3k tokens) - Context
✅ PM agent (2k tokens) - Requirements clarification
✅ nodejs-backend skill (8k tokens) - Implementation
✅ Tech Lead agent (2k tokens) - Planning
Result: 45k → 30k tokens (33% additional reduction)
Example 3: Architecture Review (Broad Scope)
User: "Review overall system architecture"
🔍 Analyzing task requirements...
Detected:
Task Type: Architecture Review
Domains: [ALL]
Scope: Broad (system-wide)
Full Context Needed: Yes
Skipping optimization - broad scope requires full context.
Loaded context: 45k tokens (all specs retained)
Rationale: Architecture review needs visibility across all domains
to identify integration issues, dependencies, and design patterns.
Example 4: Manual Optimization
User: "Optimize context for payment work"
context-optimizer:
🔍 Analyzing for payment domain...
Removed (25k tokens):
❌ auth-spec.md
❌ user-management-spec.md
❌ Frontend skills
❌ Strategic docs
Kept (20k tokens):
✅ payment-spec.md
✅ Payment architecture
✅ Backend skills
✅ Integration guides
Result: 45k → 20k tokens (56% reduction)
You can now work on payment features with optimized context.
Configuration
Integration with Context Loader
Workflow
// 1. User asks to work on a feature
User: "Fix authentication bug"

// 2. context-loader loads from manifest
context-loader.load({
  increment: "0001-authentication",
  manifest: "context-manifest.yaml"
})
// Result: 150k → 45k tokens (70% reduction)

// 3. context-optimizer analyzes the user prompt
context-optimizer.analyze("Fix authentication bug")
// Detects: bug-fix, auth domain, narrow scope

// 4. context-optimizer removes unneeded sections
context-optimizer.filter(loadedContext, analysis)
// Result: 45k → 27k tokens (40% additional reduction)

// 5. Return optimized context to main session
return optimizedContext
// Total: 150k → 27k (82% reduction)
Configuration in Increment
# .specweave/increments/0001-auth/context-manifest.yaml
spec_sections:
  - .specweave/docs/internal/strategy/auth/spec.md
  - .specweave/docs/internal/strategy/payment/spec.md
  - .specweave/docs/internal/strategy/users/spec.md

documentation:
  - .specweave/docs/internal/architecture/auth-design.md
  - .specweave/docs/internal/architecture/payment-integration.md

max_context_tokens: 50000

# NEW: Optimization hints
optimization:
  domains:
    auth: ["auth-spec.md", "auth-design.md"]
    payment: ["payment/spec.md", "payment-integration.md"]
    users: ["users/spec.md"]

  # Suggest which domains to keep for common tasks
  task_hints:
    "login": ["auth"]
    "payment": ["payment"]
    "billing": ["payment"]
    "user profile": ["users", "auth"]
Token Savings Examples
Realistic Project (500-page spec)
Without SpecWeave:
- Full spec loaded: 500 pages × 300 tokens = 150,000 tokens
- Every query uses 150k tokens
- Cost: $0.015 × 150 = $2.25 per query
With Context Loader (Pass 1):
- Manifest loads only auth section: 50 pages = 15,000 tokens (90% reduction)
- Cost: $0.015 × 15 = $0.225 per query
With Context Optimizer (Pass 2):
- Further refine to login endpoint: 30 pages = 9,000 tokens (94% total reduction)
- Cost: $0.015 × 9 = $0.135 per query
Savings: $2.25 → $0.135 (94% cost reduction)
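For quick what-if estimates, the per-query arithmetic above is simply tokens divided by 1,000 times the rate; the $0.015 per 1k tokens figure is this example's assumption, not a fixed price.

```typescript
// Per-query cost estimate; ratePer1k defaults to the example's $0.015 per 1k tokens.
function queryCost(tokens: number, ratePer1k = 0.015): number {
  return (tokens / 1000) * ratePer1k;
}

// queryCost(150_000) -> 2.25, queryCost(15_000) -> 0.225, queryCost(9_000) -> ~0.135
```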
Session Example (6 queries)
Scenario: Fix 3 auth bugs, 2 payment bugs, 1 user bug
| Query | Without | Pass 1 | Pass 2 | Savings |
|---|---|---|---|---|
| Auth bug 1 | 150k | 45k (auth+pay+user) | 27k (auth only) | 82% |
| Auth bug 2 | 150k | 45k | 27k | 82% |
| Auth bug 3 | 150k | 45k | 27k | 82% |
| Payment bug 1 | 150k | 45k | 28k (payment only) | 81% |
| Payment bug 2 | 150k | 45k | 28k | 81% |
| User bug 1 | 150k | 45k | 30k (user only) | 80% |
Total tokens:
- Without: 900k tokens
- Pass 1 only: 270k tokens (70% reduction)
- Pass 2: 167k tokens (81% reduction)
Cost savings:
- Without: $13.50
- Pass 1 only: $4.05
- Pass 2: $2.50
Additional savings: $1.55 per session (38% on top of Pass 1)
Best Practices
1. Let It Run Automatically
Default mode: auto-optimize after context-loader
- No manual intervention
- Adapts to each query
- Restores full context if needed
2. Review Removals for Critical Tasks
For production deploys, security reviews:
User: "Review security before deployment"
context-optimizer:
⚠️ Keeping full context (critical task detected)
3. Use Conservative Buffer for Complex Tasks
buffer_strategy: "conservative"
- Keeps adjacent domains
- Includes integration points
- Safer for refactoring
4. Custom Domains for Your Project
custom_domains:
- "payment-processing"
- "real-time-notifications"
- "analytics-pipeline"
Helps optimizer understand your project structure.
5. Monitor Optimization Accuracy
If the optimizer removes needed context:
- Lower the min_confidence threshold
- Add always_keep rules
- Use the conservative buffer strategy
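A hedged sketch of how these settings might sit together (the key names are the ones referenced in this document; their exact location and nesting in the manifest are assumptions):

```yaml
optimization:
  min_confidence: 0.6          # below this, keep full context instead of guessing
  buffer_strategy: conservative
  always_keep:
    - .specweave/docs/internal/architecture/security/
  custom_domains:
    - payment-processing
    - real-time-notifications
    - analytics-pipeline
```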
Limitations
What context-optimizer CAN'T do:
- ❌ Predict future conversation needs (only analyzes current prompt)
- ❌ Understand implicit domain relationships (unless configured)
- ❌ Read your mind (if prompt is vague, keeps more context)
What context-optimizer CAN do:
- ✅ Analyze task type and domain from prompt
- ✅ Remove obviously unrelated specs/agents
- ✅ Restore removed context if later needed
- ✅ Learn from always_keep/custom_domains config
Test Cases
TC-001: Bug Fix Optimization
Given: Context with auth+payment+user specs (45k tokens)
When: User says "Fix login bug"
Then: Keeps only auth spec (27k tokens, 40% reduction)
TC-002: Feature Development
Given: Context with multiple domains
When: User says "Add subscription billing"
Then: Keeps payment + integration specs (33% reduction)
TC-003: Architecture Review (Broad)
Given: Context with all specs
When: User says "Review architecture"
Then: Keeps all specs (0% reduction, full context needed)
TC-004: Vague Prompt
Given: Context with multiple specs
When: User says "Help me"
Then: Keeps all (low confidence, plays safe)
TC-005: Manual Domain Specification
Given: Context with all specs
When: User says "Optimize for payment work"
Then: Keeps only payment domain (50%+ reduction)
Future Enhancements
Phase 2: Conversation History Analysis
- Track which context was actually used
- Remove sections never referenced
- Learn user patterns
Phase 3: Dynamic Context Expansion
- Start with minimal context
- Add sections on-demand when mentioned
- "Just-in-time" context loading
Phase 4: Cross-Increment Context
- Detect dependencies across increments
- Load context from multiple increments intelligently
- Maintain coherence across features
Resources
- Retrieval-Augmented Generation (RAG) - Context retrieval patterns
- LongRAG: Large Context Optimization - Long context handling
- Anthropic Context Windows - Best practices
Summary
context-optimizer provides second-pass context optimization:
✅ Intent-driven filtering (analyzes user prompt)
✅ Domain-aware (removes unrelated specs)
✅ Task-type specific (bug fix vs feature vs architecture)
✅ 80%+ total reduction (on top of context-loader's 70%)
✅ Automatic (runs after context-loader)
✅ Safe (restores context if needed)
✅ Configurable (custom domains, buffer strategy)
Use it when: Working with large specs (500+ pages) where even manifest-based loading results in 30k+ tokens.
Skip it when: Context already small (<10k), broad architectural questions, or planning new features from scratch.
The result: From 150k tokens → 27k tokens = 82% total reduction, enabling work on enterprise-scale specs within Claude's context window.