Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.
Overall — 46%
Evals: Pending — no eval scenarios have been run
Quality: Passed — no known issues
Discovery — 67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description covers the core domain well and includes an explicit 'Use when' clause, which is a strength. However, the capabilities listed are somewhat high-level and could benefit from more concrete actions, and the trigger terms miss common user vocabulary like 'alerts', 'dashboards', 'metrics', or specific tool names. The 'performance optimization' trigger is broad enough to risk overlap with other skills.
Suggestions
Add more concrete actions such as 'configure alert rules, build dashboards, define SLO targets, create incident runbooks' to improve specificity.
Include additional natural trigger terms users would say, such as 'alerts', 'dashboards', 'metrics', 'Prometheus', 'Grafana', 'uptime', 'on-call', or 'APM'.
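To make the two suggestions above concrete, a hypothetical rewrite of the description frontmatter might look like the sketch below. The wording and the skill name are illustrative assumptions, not the skill's actual metadata:

```yaml
---
name: observability-engineer  # hypothetical name, for illustration only
description: >
  Build production-ready monitoring, logging, and tracing systems.
  Configure Prometheus and Grafana alert rules and dashboards, define
  SLO targets and error budgets, instrument services with OpenTelemetry,
  and create incident runbooks. Use when the user mentions alerts,
  dashboards, metrics, uptime, on-call, APM, SLIs/SLOs, incident
  response, or production reliability.
---
```

A description in this shape names concrete actions (configure, define, instrument, create) and folds the missing trigger vocabulary into the 'Use when' clause.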
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (monitoring, logging, tracing) and some actions (SLI/SLO management, incident response workflows), but the actions are described at a high level rather than listing multiple concrete operations like 'configure alerting rules, set up dashboards, define SLOs, create runbooks'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (build monitoring/logging/tracing systems, SLI/SLO management, incident response) and 'when' with an explicit trigger clause ('Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability'). | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'monitoring', 'logging', 'tracing', 'observability', 'SLI/SLO', 'incident response', and 'production reliability', but misses common user variations like 'alerts', 'dashboards', 'metrics', 'Prometheus', 'Grafana', 'on-call', or 'uptime'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The observability/monitoring niche is fairly distinct, but terms like 'performance optimization' and 'production reliability' are broad enough to potentially overlap with general DevOps, infrastructure, or performance tuning skills. | 2 / 3 |
| Total | | 9 / 12 — Passed |
Implementation — 0%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads as a persona description or job posting rather than actionable instructions for performing observability tasks. It consists almost entirely of bullet-point lists of tools, concepts, and capabilities that Claude already knows, with no executable code, concrete configurations, specific commands, or practical workflows. The four-line 'Instructions' section is far too vague to guide any real implementation work.
Suggestions
Replace the massive capability lists with concrete, executable examples: a sample Prometheus config, a PromQL query for SLO burn rate, an OpenTelemetry instrumentation snippet, or a Grafana dashboard JSON template.
Expand the 'Instructions' section into a detailed workflow with explicit validation checkpoints, e.g., 'After defining SLIs, validate with: check that each SLI has a measurable metric, a collection method, and a defined good/bad threshold.'
Move detailed tool-specific guidance into separate referenced files (e.g., PROMETHEUS.md, OTEL.md, ALERTING.md) and keep SKILL.md as a concise overview with clear navigation links.
Remove the 'Capabilities', 'Behavioral Traits', 'Knowledge Base', and 'Example Interactions' sections entirely—these describe what Claude already knows and waste context window tokens without adding actionable value.
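As a concrete instance of the first suggestion, a multiwindow burn-rate alert for a 99.9% availability SLO could be sketched as Prometheus rules like the following. The metric name `http_requests_total` and the `slo:` recording-rule names are assumptions for illustration, not taken from the skill:

```yaml
groups:
  - name: slo-burn-rate  # hypothetical rule group
    rules:
      - record: slo:request_error_ratio:rate5m
        expr: |
          sum(rate(http_requests_total{code=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m]))
      - record: slo:request_error_ratio:rate1h
        expr: |
          sum(rate(http_requests_total{code=~"5.."}[1h]))
            / sum(rate(http_requests_total[1h]))
      - alert: HighErrorBudgetBurn
        # 14.4x burn exhausts a 30-day error budget in about 2 days;
        # requiring both windows suppresses short-lived spikes.
        expr: |
          slo:request_error_ratio:rate5m > (14.4 * 0.001)
            and
          slo:request_error_ratio:rate1h > (14.4 * 0.001)
        for: 2m
        labels:
          severity: page
```

Even one example of this shape gives an agent something executable to adapt, which the capability lists do not.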
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose with massive capability lists, behavioral traits, knowledge base sections, and example interactions that Claude already knows. The bulk of the content is a resume-style enumeration of tools and concepts rather than actionable instructions. Most of this could be eliminated without losing any practical value. | 1 / 3 |
| Actionability | No concrete code, commands, configuration examples, or executable guidance anywhere. The entire skill is abstract descriptions and bullet-point lists of capabilities. The 'Instructions' section is four vague steps with no specifics on how to accomplish any of them. | 1 / 3 |
| Workflow Clarity | The 'Instructions' section has 4 high-level steps with no validation checkpoints, no error recovery, and no concrete sequencing. The 'Response Approach' section is similarly vague. For a skill involving production monitoring and potentially destructive operations, the complete absence of validation steps is a critical gap. | 1 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files. Hundreds of lines of capability lists are inlined that could be split into focused reference documents. No navigation structure or links to deeper resources for specific topics like SLO management or tracing setup. | 1 / 3 |
| Total | | 4 / 12 — Passed |
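The validation checkpoint suggested earlier ('each SLI has a measurable metric, a collection method, and a defined good/bad threshold') can be made executable with a small helper. This is a sketch under the assumption that SLIs are declared as plain dictionaries; the field names are hypothetical, not a real API:

```python
REQUIRED_FIELDS = ("metric", "collection_method", "good_threshold")

def validate_slis(slis):
    """Return a list of problems; an empty list means every SLI names
    a measurable metric, a collection method, and a good/bad threshold."""
    problems = []
    for sli in slis:
        name = sli.get("name", "<unnamed>")
        for field in REQUIRED_FIELDS:
            if not sli.get(field):
                problems.append(f"{name}: missing {field}")
    return problems

# Example: one complete SLI and one incomplete SLI.
slis = [
    {"name": "availability",
     "metric": "http_requests_total",
     "collection_method": "prometheus scrape",
     "good_threshold": "status < 500"},
    {"name": "latency",
     "metric": "http_request_duration_seconds"},
]

print(validate_slis(slis))
# → ['latency: missing collection_method', 'latency: missing good_threshold']
```

A checkpoint like this turns the vague 'define SLIs' step into something an agent can verify before moving on to alerting.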
Validation — 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 — Passed |