kubernetes-specialist
Design, operate, and troubleshoot Kubernetes clusters with guardrails
allowed_tools: Read, Write, Edit, Bash, Glob, Grep, Task, TodoWrite
model: sonnet
$ Installer
git clone https://github.com/DNYoussef/context-cascade /tmp/context-cascade && cp -r /tmp/context-cascade/skills/operations/cloud-platforms/kubernetes-specialist ~/.claude/skills/context-cascade
Tip: run this command in your terminal to install the skill.
SKILL.md
name: kubernetes-specialist
description: Design, operate, and troubleshoot Kubernetes clusters with guardrails
allowed-tools: Read, Write, Edit, Bash, Glob, Grep, Task, TodoWrite
model: sonnet
x-version: 3.2.0
x-category: operations
x-vcl-compliance: v3.1.1
x-cognitive-frames:
- HON
- MOR
- COM
- CLS
- EVD
- ASP
- SPC
STANDARD OPERATING PROCEDURE
Purpose
Deliver resilient Kubernetes clusters with clear RBAC, networking, autoscaling, and recovery patterns.
Trigger Conditions
- Positive: Kubernetes cluster setup or upgrade; Workload scheduling and capacity tuning; Kubernetes incident triage
- Negative: Cloud account governance (route to cloud-platforms); Application-level performance only (route to performance-analysis); Single-container packaging (route to docker-containerization)
Guardrails
- Structure-first: keep SKILL.md aligned with examples/, tests/, and any resources/references so downstream agents always have scaffolding.
- Adversarial validation is mandatory: cover boundary cases, failure paths, and rollback drills before declaring the SOP complete.
- Prompt hygiene: separate hard vs. soft vs. inferred constraints and confirm inferred constraints before acting.
- Explicit confidence ceilings: format as 'Confidence: X.XX (ceiling: TYPE Y.YY)' and never exceed the ceiling for the claim type.
- MCP traceability: tag sessions WHO=operations-{name}-{session_id}, WHY=skill-execution, and capture evidence links in outputs.
- Avoid anti-patterns: undocumented changes, missing rollback paths, skipped tests, or unbounded automation without approvals.
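The confidence and traceability guardrails above can be illustrated with a minimal sketch. The function names `format_confidence` and `session_tags` are hypothetical, and clamping a value to its ceiling (rather than rejecting it) is an assumption; the SOP only states that the ceiling must never be exceeded.

```python
def format_confidence(value: float, ceiling_type: str, ceiling: float) -> str:
    """Render 'Confidence: X.XX (ceiling: TYPE Y.YY)' without exceeding the ceiling."""
    if not 0.0 <= ceiling <= 1.0:
        raise ValueError("ceiling must be between 0 and 1")
    # Assumption: values above the ceiling are clamped down instead of raising.
    clamped = max(0.0, min(value, ceiling))
    return f"Confidence: {clamped:.2f} (ceiling: {ceiling_type} {ceiling:.2f})"


def session_tags(name: str, session_id: str) -> dict:
    """Build the WHO/WHY tags described in the MCP traceability guardrail."""
    return {"WHO": f"operations-{name}-{session_id}", "WHY": "skill-execution"}
```

For example, `format_confidence(0.90, "inference", 0.70)` yields the same string shape as the confidence line that closes this SOP.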
Required Artifacts
- SKILL.md (this SOP)
- metadata.json for registry details
Execution Phases
1. Assess cluster health
- Capture versions, control plane status, and add-ons
- Review policies: RBAC, OPA, network policies, quotas
- Map workloads, capacity, and SLOs
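As one possible way to capture versions and node status during assessment, the sketch below summarizes a NodeList JSON document such as the output of `kubectl get nodes -o json`. The function name `summarize_nodes` and the shape of the returned summary are illustrative assumptions, not part of the skill.

```python
import json
from collections import Counter


def summarize_nodes(node_list_json: str) -> dict:
    """Summarize readiness and kubelet versions from a NodeList JSON document."""
    items = json.loads(node_list_json).get("items", [])
    not_ready = []
    versions = Counter()
    for node in items:
        name = node["metadata"]["name"]
        status = node.get("status", {})
        # kubeletVersion lives under status.nodeInfo in the Node API object.
        versions[status.get("nodeInfo", {}).get("kubeletVersion", "unknown")] += 1
        # A node counts as ready only if its "Ready" condition has status "True".
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if not ready:
            not_ready.append(name)
    return {
        "total": len(items),
        "not_ready": sorted(not_ready),
        "kubelet_versions": dict(versions),
        "mixed_versions": len(versions) > 1,
    }
```

A summary like this could feed the cluster profile; it covers nodes only, so control plane status and add-ons would still need separate checks.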
2. Plan topology and workloads
- Design namespace model, ingress/egress, storage classes
- Define deployment strategy (Helm/Kustomize) with approvals
- Set autoscaling policies and resource limits/requests
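To make the autoscaling and resource planning step concrete, here is a minimal sketch that builds a HorizontalPodAutoscaler manifest (autoscaling/v2) as a plain dictionary and lints a workload for missing requests or limits. The helper names `build_hpa` and `missing_resources`, and the 70% CPU default, are assumptions chosen for illustration.

```python
def build_hpa(name: str, namespace: str, min_replicas: int, max_replicas: int,
              cpu_utilization: int = 70) -> dict:
    """Build an autoscaling/v2 HPA manifest targeting a Deployment of the same name."""
    if not 1 <= min_replicas <= max_replicas:
        raise ValueError("require 1 <= min_replicas <= max_replicas")
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": cpu_utilization},
                },
            }],
        },
    }


def missing_resources(workload: dict) -> list:
    """List containers in a pod template that lack resource requests or limits."""
    containers = (
        workload.get("spec", {}).get("template", {}).get("spec", {}).get("containers", [])
    )
    problems = []
    for container in containers:
        resources = container.get("resources") or {}
        for field in ("requests", "limits"):
            if not resources.get(field):
                problems.append(f"{container.get('name', '<unnamed>')}: missing {field}")
    return problems
```

The two checks belong together because CPU utilization targets are computed relative to container requests, so an HPA on a workload without requests has nothing to measure against. The dictionary can be serialized with `json.dumps` and passed to `kubectl apply -f -`, since kubectl accepts JSON manifests.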
3. Execute changes
- Apply manifests or upgrades in staged environments
- Validate admission controls and runtime security
- Tune nodes, CNI, and observability hooks
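Staged application of manifests might be sketched as below: each environment gets a server-side dry run, a real apply, and a rollout wait, and promotion stops at the first failure. The context names, the 120-second timeout, and the injected `runner` callable are all hypothetical; a real runner could wrap `subprocess.run`, and approval gates between stages are not modeled here.

```python
from typing import Callable, List, Optional, Tuple


def stage_commands(context: str, namespace: str, manifest_path: str,
                   deployment: str) -> List[List[str]]:
    """Return the kubectl argv lists for one stage: dry run, apply, rollout wait."""
    base = ["kubectl", "--context", context, "--namespace", namespace]
    return [
        base + ["apply", "--dry-run=server", "-f", manifest_path],
        base + ["apply", "-f", manifest_path],
        base + ["rollout", "status", f"deployment/{deployment}", "--timeout=120s"],
    ]


def run_staged(contexts: List[str], namespace: str, manifest_path: str, deployment: str,
               runner: Callable[[List[str]], bool]) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (completed contexts, context that failed or None)."""
    completed = []
    for context in contexts:
        for argv in stage_commands(context, namespace, manifest_path, deployment):
            if not runner(argv):
                # Stop promotion so later environments are never touched.
                return completed, context
        completed.append(context)
    return completed, None
```

Injecting the runner keeps the ordering logic testable without a cluster and leaves room to insert an approval prompt or rollback step before the next stage begins.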
4. Validate resilience
- Run health checks, conformance tests, and chaos/DR drills
- Verify backup/restore for critical data
- Document runbooks and escalation paths
Output Format
- Cluster profile with risks and dependencies
- Deployment plan with manifests/Helm references
- Operational controls (RBAC, quotas, policies) documented
- Validation results (conformance, performance, DR) with evidence
- Runbook updates and contact paths
Validation Checklist
- API server health verified before and after changes
- RBAC, network, and quota policies defined and applied
- Autoscaling and capacity checks completed
- Backup/restore or snapshot path identified
- Confidence ceiling stated for cluster readiness
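The checklist above can be treated as a gate rather than a reminder. The following sketch, with hypothetical short keys standing in for each checklist item, reports which items are still outstanding and only marks the cluster ready when none remain.

```python
# Hypothetical short keys, one per checklist item in this SOP.
CHECKLIST = (
    "api_server_health",
    "rbac_network_quota_policies",
    "autoscaling_capacity",
    "backup_restore_path",
    "confidence_ceiling_stated",
)


def readiness_report(results: dict) -> dict:
    """Return outstanding checklist items; anything absent counts as unverified."""
    outstanding = [item for item in CHECKLIST if not results.get(item, False)]
    return {"ready": not outstanding, "outstanding": outstanding}
```

Treating a missing entry as unverified is a deliberate choice in this sketch: it keeps a skipped check from being read as a pass.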
Confidence: 0.70 (ceiling: inference 0.70) - Kubernetes SOP aligns to guardrails and staged validation
Repository
DNYoussef/context-cascade/skills/operations/cloud-platforms/kubernetes-specialist
Author: DNYoussef
Stars: 8
Forks: 2
Updated 5d ago
Added 1w ago