Name: observability-rca
Rating: 54.4 (1 review)
Author: elastic

Use this skill when performing root cause analysis on incidents detected by Elastic Observability. Activate when the user reports a production issue, outage, degraded performance, or asks to investigate alerts.

Quality

59%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./packages/opencode/src/elastic/skills/observability-rca/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides strong, actionable ES|QL queries that form a solid investigation toolkit for Elastic Observability incidents. However, it includes generic knowledge Claude already possesses (common root causes, how to write postmortems) and lacks validation checkpoints critical for incident investigation workflows—such as verifying hypotheses before declaring root cause or decision trees for when initial queries don't yield results.

Suggestions

Remove or significantly trim the 'Common Root Causes' table and 'Resolution Documentation' section—Claude already knows these patterns and how to write incident reports.

Add validation checkpoints between investigation steps, e.g., 'If error rate query returns no results, broaden the time window or check index patterns' and 'Verify root cause hypothesis by correlating at least two independent signals before concluding.'

Add decision branching: what to do when the initial scope assessment shows no errors, or when traces are unavailable for a service.
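The checkpoint and branching suggestions above can be sketched in code. This is a minimal illustration and not part of the reviewed skill: `run_query`, the window sizes, and the signal names are all hypothetical stand-ins for whatever query runner and evidence sources an agent actually has.

```python
from typing import Callable, Optional, Sequence

# Hypothetical query runner: takes a look-back window in minutes and
# returns the result rows of some error-rate query (an assumption, not a real API).
QueryFn = Callable[[int], list]


def query_with_widening(
    run_query: QueryFn,
    windows: Sequence[int] = (15, 60, 240),
) -> Optional[tuple]:
    """Try each window in turn; return (window, rows) for the first non-empty result."""
    for minutes in windows:
        rows = run_query(minutes)
        if rows:
            return minutes, rows
    # Every window came back empty: check index patterns or switch to
    # another signal instead of concluding that nothing is wrong.
    return None


def hypothesis_supported(signals: dict, minimum: int = 2) -> bool:
    """Accept a root-cause hypothesis only if enough independent signals agree."""
    return sum(1 for agrees in signals.values() if agrees) >= minimum


if __name__ == "__main__":
    # Simulated data: nothing in the last 15 minutes, errors within the last hour.
    fake_results = {15: [], 60: [{"service": "checkout", "errors": 42}]}
    print(query_with_widening(lambda m: fake_results.get(m, [])))
    print(hypothesis_supported({"logs": True, "traces": False, "metrics": True}))
```

Returning `None` rather than an empty list makes the "no results at any window" branch explicit, so the caller has to decide what to check next instead of silently proceeding.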

Dimension	Reasoning	Score
Conciseness	Mostly efficient with concrete queries, but the 'Common Root Causes' table and 'Resolution Documentation' section explain things Claude already knows (how to write incident reports, common infrastructure failure modes). The symptom-cause table is generic knowledge that doesn't earn its tokens.	2 / 3
Actionability	Provides fully executable ES|QL queries for each investigation step, with specific field names, aggregations, and filters. The queries are copy-paste ready with clear placeholders for variable substitution.	3 / 3
Workflow Clarity	The 5-step investigation framework provides a clear sequence, but lacks validation checkpoints or feedback loops. There's no guidance on what to do if queries return no results, how to verify a hypothesis before declaring root cause, or when to escalate. For an investigation workflow involving production incidents, explicit decision points and verification steps are important.	2 / 3
Progressive Disclosure	Content is reasonably structured with clear sections, but everything is inline in a single file. The common root causes table and resolution documentation template could be separated into reference files. For a skill with no bundle files, the content is borderline monolithic at this length.	2 / 3
	Total	9 / 12 Passed

Description

54%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description excels at defining when to use the skill with rich, natural trigger terms, but it is notably weak on specifying what concrete actions the skill performs. It reads more like a trigger clause without a capability summary. Adding specific actions (e.g., querying Elasticsearch logs, analyzing APM traces, correlating metrics) would significantly improve it.

Suggestions

Add specific concrete actions the skill performs, e.g., 'Queries Elasticsearch logs, analyzes APM traces, correlates metrics across services, and inspects anomaly detection results to identify root causes.'

Mention Elastic-specific artifacts and tools (e.g., Kibana dashboards, APM traces, log indices, alerting rules) to improve both specificity and distinctiveness from generic incident analysis skills.

Use third-person voice ('Performs root cause analysis...') instead of the imperative 'Use this skill when...' framing to lead with capabilities before trigger conditions.

Dimension	Reasoning	Score
Specificity	The description mentions 'root cause analysis on incidents' but does not list any concrete actions (e.g., query logs, analyze traces, correlate metrics, inspect APM data). It stays at an abstract level without specifying what the skill actually does.	1 / 3
Completeness	The 'when' is explicitly and thoroughly covered with clear trigger scenarios. However, the 'what' is weak — it only says 'performing root cause analysis' without detailing the specific actions or capabilities the skill provides.	2 / 3
Trigger Term Quality	Includes strong natural trigger terms: 'production issue', 'outage', 'degraded performance', 'investigate alerts', 'root cause analysis', 'incidents', and 'Elastic Observability'. These are terms users would naturally use when seeking this kind of help.	3 / 3
Distinctiveness Conflict Risk	The mention of 'Elastic Observability' provides some distinctiveness, but 'root cause analysis' and 'production issue' are broad enough to overlap with other incident management or monitoring skills. Without specific actions tied to Elastic's tooling, conflict risk remains moderate.	2 / 3
	Total	8 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: elastic/elastic-ramen
Commit: 2e200ec

Reviewed: 23 days ago
