Set up monitoring, logging, and observability for applications and infrastructure. Use when implementing health checks, metrics collection, log aggregation, or alerting systems. Handles Prometheus, Grafana, ELK Stack, Datadog, and monitoring best practices.
Overall score: 86

Impact — 94%
1.14x average score across 3 eval scenarios. Passed; no known issues.

Quality — 82%
Does it follow best practices?
Discovery — 100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-crafted skill description that excels across all dimensions. It uses third person voice correctly, provides specific concrete actions, includes an explicit 'Use when...' clause with natural trigger terms, and names specific tools that create a distinct identity. The description effectively balances comprehensiveness with conciseness.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions — 'monitoring, logging, and observability', 'health checks, metrics collection, log aggregation, alerting systems' — all of which are actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what ('Set up monitoring, logging, and observability...Handles Prometheus, Grafana, ELK Stack, Datadog') and when ('Use when implementing health checks, metrics collection, log aggregation, or alerting systems'). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of terms users naturally say — 'monitoring', 'logging', 'observability', 'health checks', 'metrics', 'alerting' — plus specific tool names (Prometheus, Grafana, ELK Stack, Datadog) that users commonly reference. | 3 / 3 |
| Distinctiveness / Conflict Risk | A clear niche in the observability/monitoring domain, with distinct triggers (Prometheus, Grafana, ELK, Datadog, metrics, alerting) that are unlikely to conflict with other skills such as general DevOps or deployment skills. | 3 / 3 |
| Total | | 12 / 12 — Passed |
Implementation — 64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides highly actionable, production-ready code examples for monitoring setup with Prometheus, Grafana, and structured logging. Its main weaknesses are the lack of validation checkpoints in the workflow (critical for multi-system integrations) and the monolithic structure that could benefit from splitting detailed configurations into separate files. The empty example placeholders and metadata section add unnecessary bulk.
Suggestions
- Add validation checkpoints after each step, e.g., 'Verify: curl localhost:3000/metrics should return Prometheus-formatted output' and 'Test alert: temporarily set threshold to 0 to confirm the alerting pipeline works'
- Move the detailed Grafana dashboard JSON and alert_rules.yml content to separate reference files (e.g., DASHBOARDS.md, ALERTS.md) and link to them from the main skill
- Remove the empty 'Examples' section placeholders and the 'Metadata' section, which add no actionable value for Claude
- Add a troubleshooting section for common issues like 'metrics not appearing in Prometheus' or 'alerts not firing'
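The first suggestion above can be sketched as a small checkpoint script. This is a hedged illustration, not part of the skill: the `localhost:3000/metrics` endpoint is taken from the suggestion text, and `looksLikePrometheusFormat` is a hypothetical helper that only approximates the Prometheus text exposition format.

```typescript
// Sketch of a post-setup validation checkpoint (assumes the instrumented
// app exposes metrics at http://localhost:3000/metrics, as the suggestion
// above states).

// Rough shape of Prometheus text exposition format: `# HELP` / `# TYPE`
// comment lines plus `metric_name{labels} value` sample lines.
function looksLikePrometheusFormat(body: string): boolean {
  const lines = body.split("\n").filter((l) => l.trim().length > 0);
  if (lines.length === 0) return false;
  const comment = /^# (HELP|TYPE) /;
  const sample = /^[a-zA-Z_:][a-zA-Z0-9_:]*(\{[^}]*\})?\s+\S+/;
  return lines.every((l) => comment.test(l) || sample.test(l));
}

async function verifyMetricsEndpoint(
  url = "http://localhost:3000/metrics",
): Promise<void> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`metrics endpoint returned HTTP ${res.status}`);
  if (!looksLikePrometheusFormat(await res.text())) {
    throw new Error("response body is not Prometheus-formatted");
  }
  console.log("Verify passed: endpoint returns Prometheus-formatted output");
}
```

Running a checkpoint like this after each setup step catches integration failures (wrong port, scrape path, or content type) before moving on to Grafana and alerting.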
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill provides substantial executable code, which is valuable, but includes some unnecessary verbosity: the metadata section, empty example placeholders, and a 'When to use this skill' section that Claude could infer. The core content is useful but could be tightened. | 2 / 3 |
| Actionability | Excellent executable code examples throughout — complete TypeScript instrumentation, Prometheus config, alert rules, Winston logging setup, Grafana dashboard JSON, and a health check implementation. All examples are copy-paste ready with real configurations. | 3 / 3 |
| Workflow Clarity | Steps are numbered and sequenced (Step 1-5), but there are no validation checkpoints or feedback loops. For a monitoring setup involving multiple interconnected systems (Prometheus, Grafana, alerting), there should be explicit verification steps like 'verify metrics endpoint returns data' or 'confirm alerts fire correctly in test mode'. | 2 / 3 |
| Progressive Disclosure | Content is reasonably structured with clear sections, but the skill is monolithic at ~300 lines. The Grafana dashboard JSON and detailed alert rules could be split into separate reference files. The References section links to external docs but doesn't leverage internal file organization. | 2 / 3 |
| Total | | 9 / 12 — Passed |
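As a concrete illustration of the health-check pattern the Actionability row credits, here is a minimal sketch using only Node's built-in `http` module. The `/healthz` and `/readyz` paths and the readiness-check wiring are illustrative assumptions, not content taken from the skill itself.

```typescript
import { createServer } from "node:http";

// Hypothetical readiness probe: each named check resolves true when the
// dependency (database, cache, downstream API, ...) is reachable.
type Check = () => Promise<boolean>;

function healthServer(readinessChecks: Record<string, Check>) {
  return createServer(async (req, res) => {
    if (req.url === "/healthz") {
      // Liveness: the process is up and able to serve requests.
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ status: "ok" }));
    } else if (req.url === "/readyz") {
      // Readiness: all dependencies answered; 503 keeps traffic away otherwise.
      const checks: Record<string, boolean> = {};
      for (const [name, check] of Object.entries(readinessChecks)) {
        checks[name] = await check().catch(() => false);
      }
      const ready = Object.values(checks).every(Boolean);
      res.writeHead(ready ? 200 : 503, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ ready, checks }));
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}

// Example wiring with a placeholder dependency check:
// healthServer({ db: async () => pingDatabase() }).listen(3000);
```

Separating liveness from readiness is the usual design choice here: an orchestrator restarts on a failing `/healthz` but merely withholds traffic on a failing `/readyz`.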
Validation — 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 Passed |
Commit: c033769