🔧

Monitoring

153 skills in DevOps > Monitoring

observability-setup

Implements comprehensive observability with OpenTelemetry tracing, Prometheus metrics, and structured logging. Includes instrumentation plans, sample dashboards, and alert candidates. Use for "observability", "monitoring", "tracing", or "metrics".

patricio0312rev/skillset
2
0
Updated 7h ago

distributed-tracing-setup

Configure distributed tracing with Jaeger, Zipkin, or Datadog for microservices observability

Dexploarer/hyper-forge
2
1
Updated 7h ago

collecting-infrastructure-metrics

Marketplace

This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5h ago

otel-observability

Implement OpenTelemetry-based observability for traces, metrics, and logs. Use for distributed tracing, Prometheus metrics, structured logging, and agent monitoring. Triggers on "OpenTelemetry", "OTEL", "tracing", "metrics", "observability", "Prometheus", "Grafana", "distributed tracing", or when implementing spec/007-observability.md.

raphaelmansuy/k8s-agent-stack
2
0
Updated 5h ago

health-check-implementation

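A skill that provides guidelines for designing, implementing, and monitoring health checks, with guidance for establishing the reliability and observability of microservices. Anchors: • Observability Engineering (Charity Majors) / Application: observability principles for health check design / Purpose: selecting effective monitoring indicators • Site Reliability Engineering (Google) / Application: level classification and phased implementation of health checks / Purpose: optimizing operational load • Release It! (Michael T. Nygard) / Application: failure-handling patterns / Purpose: automatic recovery and fail-safe design. Trigger: Use when designing microservice health checks, implementing system reliability monitoring, establishing baseline metrics, or configuring alert thresholds. health check, liveness probe, readiness probe, monitoring, metrics, observability, kubernetes probes, circuit breaker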

daishiman/AIWorkflowOrchestrator
2
0
Updated 5h ago

Unnamed Skill

Comprehensive logging and observability patterns for production systems. Use when implementing structured logging, distributed tracing, metrics collection, or alerting. Triggers: logging, logs, observability, tracing, metrics, OpenTelemetry, correlation ID, spans, alerts, monitoring, JSON logs.

cosmix/claude-code-setup
2
0
Updated 3h ago

linux-fundamentals-skill

Marketplace

Complete Linux administration skill covering process management, filesystem, permissions, package management, users, bash scripting, and system monitoring.

pluginagentmarketplace/custom-plugin-devops
2
0
Updated 3h ago

collecting-infrastructure-metrics

Marketplace

This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 3h ago

Unnamed Skill

Prometheus monitoring and alerting for cloud-native observability. Use when implementing metrics collection, PromQL queries, alerting rules, or service discovery. Triggers: prometheus, promql, metrics, alertmanager, service discovery, recording rules, alerting, scrape config.

cosmix/claude-code-setup
2
0
Updated 1h ago

file-watcher-observability

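A skill that implements observability for file watcher systems based on the three pillars (Metrics, Logs, Traces). Supports SLA compliance measurement, performance monitoring, and root-cause analysis of failures through Prometheus/Grafana integration. Anchors: • Observability Engineering (Charity Majors) / Application: three-pillar design / Purpose: integrating metrics, logs, and traces • Google SRE Book / Application: golden signals / Purpose: SLI/SLO design • Prometheus Documentation / Application: metric naming conventions / Purpose: standards-compliant implementation. Trigger: Use when implementing observability for file watcher systems, setting up Prometheus/Grafana monitoring, designing SLI/SLO metrics, or analyzing production performance issues.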

daishiman/AIWorkflowOrchestrator
2
0
Updated 1h ago

observability

Marketplace

Distributed tracing with Jaeger, OpenTelemetry, and observability platforms for microservices insights

pluginagentmarketplace/custom-plugin-devops
2
0
Updated 1h ago

deploying-monitoring-stacks

Marketplace

This skill deploys monitoring stacks, including Prometheus, Grafana, and Datadog. It is used when the user needs to set up or configure monitoring infrastructure for applications or systems. The skill generates production-ready configurations, implements best practices, and supports multi-platform deployments. Use this when the user explicitly requests to deploy a monitoring stack, or mentions Prometheus, Grafana, or Datadog in the context of infrastructure setup.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5d ago

database-monitoring

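A skill that systematizes the design, implementation, and verification of database monitoring. Organizes monitoring operations covering SQLite/Turso statistics, slow queries, connection counts, and storage. Anchors: • Designing Data-Intensive Applications / Application: performance and scaling design / Purpose: grounding the choice of monitoring metrics • Database Reliability Engineering / Application: SRE monitoring design / Purpose: systematizing monitoring items • Observability / Application: monitoring and diagnosis / Purpose: improving observability. Trigger: Use when designing database monitoring metrics, alert thresholds, SLI/SLOs, or operational dashboards for SQLite/Turso. database monitoring, sqlite metrics, turso monitoring, slow query, alerting, sli slo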

daishiman/AIWorkflowOrchestrator
2
0
Updated 5d ago

grafana

Observability visualization with Grafana and the LGTM stack (Loki, Grafana, Tempo, Mimir). Use when implementing dashboards, log aggregation, distributed tracing, or metrics visualization. Triggers: grafana, loki, tempo, mimir, dashboard, logql, traceql, observability stack, LGTM.

cosmix/claude-code-setup
2
0
Updated 5d ago

collecting-infrastructure-metrics

Marketplace

This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5d ago

setting-up-log-aggregation

Marketplace

This skill sets up log aggregation solutions using ELK (Elasticsearch, Logstash, Kibana), Loki, or Splunk. It generates production-ready configurations and setup code based on specific requirements and infrastructure. Use this skill when the user requests to set up logging infrastructure, configure log aggregation, deploy ELK stack, deploy Loki, deploy Splunk, or needs help with observability. It is triggered by terms like "log aggregation," "ELK setup," "Loki configuration," "Splunk deployment," or similar requests for centralized logging solutions.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5d ago

creating-alerting-rules

Marketplace

This skill enables Claude to create intelligent alerting rules for proactive performance monitoring. It is triggered when the user requests to "create alerts", "define monitoring rules", or "set up alerting". The skill helps define thresholds, routing, and escalation policies, and offers options for multi-category alert creation, including latency, error rate, throughput, resource utilization, availability, and SLO violation alerts. It is useful for Site Reliability Engineers (SREs) and DevOps teams looking to improve system observability.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5d ago

observability-pillars

A skill for the integrated design of the three pillars of observability (logs, metrics, traces). Enables linkage through correlation IDs and bi-directional navigation (metrics → logs → traces). Anchors: • Observability Engineering (Charity Majors) / Application: three-pillar integration patterns / Purpose: high-cardinality observability • Google SRE Book / Application: metric design and SLI/SLO / Purpose: reliability engineering • W3C Trace Context / Application: distributed tracing standard / Purpose: interoperable trace propagation. Trigger: Use when integrating logs, metrics, and traces with correlation IDs, designing bi-directional navigation between pillars, implementing OpenTelemetry, or setting up high-cardinality observability. observability, three pillars, logs, metrics, traces, correlation ID, OpenTelemetry, tracing, distributed systems

daishiman/AIWorkflowOrchestrator
2
0
Updated 5d ago

deploying-monitoring-stacks

Marketplace

This skill deploys monitoring stacks, including Prometheus, Grafana, and Datadog. It is used when the user needs to set up or configure monitoring infrastructure for applications or systems. The skill generates production-ready configurations, implements best practices, and supports multi-platform deployments. Use this when the user explicitly requests to deploy a monitoring stack, or mentions Prometheus, Grafana, or Datadog in the context of infrastructure setup.

jeremylongshore/claude-code-plugins-nixtla
2
0
Updated 5d ago

logging-observability

A skill for designing structured logging and observability for production systems. Implements the three pillars of Logs, Metrics, and Traces to achieve full system visibility. Anchors: • "Observability Engineering" (Charity Majors) / Application: high-cardinality debugging / Purpose: root-cause analysis • "The Art of Monitoring" (James Turnbull) / Application: metrics strategy / Purpose: effective alerting • Twelve-Factor App (Factor XI) / Application: logs as event streams / Purpose: cloud-native logging • OpenTelemetry Specification / Application: instrumentation / Purpose: vendor-neutral observability. Trigger: Use when implementing logging, setting up observability, designing metrics/alerting, integrating distributed tracing, or troubleshooting production systems.

daishiman/AIWorkflowOrchestrator
2
0
Updated 5d ago