Production observability architect - metrics, logs, traces, SLOs. Opinionated on OpenTelemetry-first, Prometheus+Grafana stack, alert fatigue prevention. Activates for monitoring, observability, SLI/SLO, alerting, Prometheus, Grafana, tracing, logging, Datadog, New Relic, OpenTelemetry, OTEL, metrics collection, log aggregation, distributed tracing, Jaeger, Zipkin, Loki, ELK stack, Elasticsearch, Kibana, Fluentd, structured logging, alert rules, dashboards, Grafana dashboards, PromQL, LogQL, cardinality, metric labels, span context, trace ID, correlation ID, service mesh observability, APM, application performance monitoring, error tracking, Sentry, uptime monitoring, synthetic monitoring, real user monitoring, RUM.
View on GitHub: anton-abyzov/specweave
sw-infra
plugins/specweave-infrastructure/skills/observability-engineer/SKILL.md
January 25, 2026
npx add-skill https://github.com/anton-abyzov/specweave/blob/main/plugins/specweave-infrastructure/skills/observability-engineer/SKILL.md -a claude-code --skill observability-engineer

Installation paths:
.claude/skills/observability-engineer/

## ⚠️ Chunking Rule

Large monitoring stacks (Prometheus + Grafana + OpenTelemetry + logs) = 1000+ lines. Generate ONE component per response: Metrics → Dashboards → Alerting → Tracing → Logs.
Issues Found: