🔧

Monitoring

153 skills in DevOps > Monitoring

server-management

Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands.

xenitV1/claude-code-maestro
62
15
Updated 3d ago

monitoring-observability

Implement comprehensive monitoring, logging, metrics, tracing, and alerting for production applications to ensure reliability and quick incident response. Use when setting up application monitoring, implementing structured logging, creating metrics and dashboards, setting up alerts, implementing distributed tracing, monitoring performance, tracking errors, or building observability into applications.

korallis/Droidz
49
6
Updated 3d ago

sentry-performance-monitoring

Marketplace

Use when setting up performance monitoring, distributed tracing, or profiling with Sentry. Covers transactions, spans, and performance insights.

TheBushidoCollective/han
47
5
Updated 3d ago

sre-monitoring-and-observability

Marketplace

Use when building comprehensive monitoring and observability systems.

TheBushidoCollective/han
47
5
Updated 3d ago

aws-cost-operations

Marketplace

This skill provides AWS cost optimization, monitoring, and operational best practices with integrated MCP servers for billing analysis, cost estimation, observability, and security assessment.

zxkane/aws-skills
40
7
Updated 3d ago

prometheus-monitoring

Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.

aj-geddes/useful-ai-prompts
25
1
Updated 3d ago

correlation-tracing

Implement distributed tracing with correlation IDs, trace propagation, and span tracking across microservices. Use when debugging distributed systems, monitoring request flows, or implementing observability.

aj-geddes/useful-ai-prompts
25
1
Updated 3d ago

log-aggregation

Implement centralized logging with ELK Stack, Loki, or Splunk for log collection, parsing, storage, and analysis across infrastructure.

aj-geddes/useful-ai-prompts
25
1
Updated 3d ago

dev-sre

Marketplace

Gate 2 of the development cycle. VALIDATES that observability was correctly implemented by developers. Does not implement observability code - only validates it.

LerianStudio/ring
25
1
Updated 3d ago

infrastructure-monitoring

Set up comprehensive infrastructure monitoring with Prometheus, Grafana, and alerting systems for metrics, health checks, and performance tracking.

aj-geddes/useful-ai-prompts
25
1
Updated 3d ago

qa-observability

Production observability and performance engineering with OpenTelemetry, distributed tracing, metrics, logging, SLO/SLI design, capacity planning, performance profiling, APM integration, and observability maturity progression for modern cloud-native systems.

vasilyu1983/AI-Agents-public
21
6
Updated 2d ago

performance-monitor

Expert performance monitor specializing in system-wide metrics collection, analysis, and optimization. Masters real-time monitoring, anomaly detection, and performance insights across distributed agent systems, with a focus on observability and continuous improvement.

zenobi-us/dotfiles
21
4
Updated 2d ago

site-reliability-engineer

Production monitoring, observability, SLO/SLI management, and incident response.
Trigger terms: monitoring, observability, SRE, site reliability, alerting, incident response, SLO, SLI, error budget, Prometheus, Grafana, Datadog, New Relic, ELK stack, logs, metrics, traces, on-call, production monitoring, health checks, uptime, availability, dashboards, post-mortem, incident management, runbook.
Completes SDD Stage 8 (Monitoring) with comprehensive production observability:
- SLI/SLO definitions and tracking
- Monitoring stack setup (Prometheus, Grafana, ELK, Datadog, etc.)
- Alert rules and notification channels
- Incident response runbooks
- Observability dashboards (logs, metrics, traces)
- Post-mortem templates and analysis
- Health check endpoints
- Error budget tracking
Use when: user needs production monitoring, observability platform, alerting, SLOs, incident response, or post-deployment health tracking.

nahisaho/MUSUBI
19
2
Updated 2d ago

grey-haven-observability-engineering

Marketplace

Production-ready monitoring, logging, and tracing using Prometheus, Grafana, OpenTelemetry, DataDog, and Sentry. Use when setting up production monitoring, implementing SLOs, distributed tracing, or performance tracking.

greyhaven-ai/claude-code-config
15
2
Updated 2d ago

Observability Instrumentation

Marketplace

Comprehensive observability methodology implementing the three pillars (logs, metrics, traces) with structured logging using Go slog, Prometheus-style metrics, and distributed tracing patterns. Use when adding observability from scratch, when logs are unstructured or inadequate, when there is no metrics collection, when production issues are hard to debug, or when performance monitoring is needed. Provides structured logging patterns (contextual logging, log levels DEBUG/INFO/WARN/ERROR, request ID propagation), metrics instrumentation (counter/gauge/histogram patterns, Prometheus exposition), tracing setup (span creation, context propagation, sampling strategies), and Go slog best practices (JSON formatting, attribute management, handler configuration). Validated in meta-cc with a 23-46x speedup vs. ad-hoc logging and 90-95% transferability across languages (slog is specific to Go, but the patterns are universal).

yaleh/meta-cc
15
1
Updated 2d ago

ln-367-observability-auditor

Marketplace

Observability audit worker (L3). Checks structured logging, health check endpoints, metrics collection, request tracing, log levels. Returns findings with severity, location, effort, recommendations.

levnikolaevich/claude-code-skills
13
1
Updated 2d ago

Unnamed Skill

Marketplace

Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning. Keywords: monitoring, observability, logging, metrics, tracing, alerting, Prometheus, Grafana.

Jeffallan/claude-skills
12
1
Updated 2d ago

monitoring

Monitoring standards for DevOps environments. Covers best

williamzujkowski/standards
11
0
Updated 2d ago

agent-performance-monitor

Expert performance monitor specializing in system-wide metrics collection, analysis, and optimization. Masters real-time monitoring, anomaly detection, and performance insights across distributed agent systems, with a focus on observability and continuous improvement.

Tony363/SuperClaude
10
0
Updated 2d ago