
monitoring-observability

Monitoring and observability strategy, implementation, and troubleshooting. Use for designing metrics/logs/traces systems, setting up Prometheus/Grafana/Loki, creating alerts and dashboards, calculating SLOs and error budgets, analyzing performance issues, and comparing monitoring tools (Datadog, ELK, CloudWatch). Covers the Four Golden Signals, RED/USE methods, OpenTelemetry instrumentation, log aggregation patterns, and distributed tracing.
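One of the tasks the description names, calculating SLOs and error budgets, is simple arithmetic. As context for what the skill covers, here is a minimal, illustrative sketch (the function names and the 30-day window are assumptions, not taken from the skill itself):

```python
def error_budget_minutes(slo_target: float, period_minutes: int = 30 * 24 * 60) -> float:
    """Allowed downtime in minutes for an availability SLO over a period.

    slo_target is a fraction, e.g. 0.999 for a 99.9% objective.
    """
    return period_minutes * (1 - slo_target)


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent, from observed traffic."""
    allowed_failures = total_events * (1 - slo_target)
    actual_failures = total_events - good_events
    return 1 - actual_failures / allowed_failures
```

For a 99.9% monthly SLO this yields roughly 43 minutes of allowed downtime; a service that has consumed half its allowed failures reports 0.5 remaining.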

Install with Tessl CLI

npx tessl i github:ahmedasmar/devops-claude-skills --skill monitoring-observability

Overall score: 90%

Does it follow best practices?


Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that comprehensively covers the monitoring and observability domain. It correctly uses the third-person voice, lists specific tools and methodologies, includes natural trigger terms practitioners would use, and clearly delineates when the skill should be activated. The description strikes a good balance between being comprehensive and staying focused on its domain.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific, concrete actions: 'designing metrics/logs/traces systems', 'setting up Prometheus/Grafana/Loki', 'creating alerts and dashboards', 'calculating SLOs and error budgets', 'analyzing performance issues', and 'comparing monitoring tools'. Also covers specific methodologies such as the Four Golden Signals, RED/USE methods, and OpenTelemetry instrumentation. | 3 / 3 |
| Completeness | Clearly answers both what (monitoring strategy, implementation, troubleshooting with specific tools and methods) and when ('Use for designing...', 'setting up...', 'creating...', 'calculating...', 'analyzing...', 'comparing...'). The 'Use for' clause provides explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'monitoring', 'observability', 'metrics', 'logs', 'traces', 'Prometheus', 'Grafana', 'Loki', 'alerts', 'dashboards', 'SLOs', 'error budgets', 'performance issues', 'Datadog', 'ELK', 'CloudWatch', 'OpenTelemetry', 'distributed tracing'. These are terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on monitoring and observability, with distinct triggers like Prometheus, Grafana, SLOs, Four Golden Signals, and OpenTelemetry. Unlikely to conflict with general DevOps or infrastructure skills given the specific monitoring focus and tool names. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality, comprehensive monitoring and observability skill with excellent actionability and structure. The decision tree, executable code examples, and clear references to deeper materials make it highly usable. The main weakness is some verbosity in explanatory sections that could be trimmed since Claude already understands basic observability concepts.

Suggestions

Remove explanatory text like 'SLI (Service Level Indicator): Measurement of service quality' and the 'When to Use Tracing' sections. Claude already knows these concepts; jump directly to the actionable guidance.

Condense the 'When to use this skill' bullet list into the decision tree or remove it entirely since the decision tree already serves this purpose

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | While the skill is comprehensive and well-organized, it includes some unnecessary explanations (e.g., defining what SLI/SLO/SLA are, explaining when to use tracing) that Claude already knows. The content could be tightened by removing introductory explanations and focusing purely on actionable guidance. | 2 / 3 |
| Actionability | Excellent actionability with executable PromQL queries, complete bash commands for scripts, working Python code examples, and copy-paste-ready configurations. Every section provides concrete, runnable examples rather than abstract descriptions. | 3 / 3 |
| Workflow Clarity | Outstanding workflow clarity with a decision tree at the start, numbered troubleshooting workflows, clear migration phases with timelines, and explicit validation steps (e.g., 'Checks for' lists after script commands). The structure guides users through complex multi-step processes effectively. | 3 / 3 |
| Progressive Disclosure | Excellent progressive disclosure with clear overview sections followed by '→ Read' and '→ Script' references to deeper materials. References are one level deep, clearly signaled with arrows, and organized by type (Scripts, References, Templates) in the summary section. | 3 / 3 |
| Total | | 11 / 12 |

Passed
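The RED method the skill's description mentions reduces to three per-service numbers: request Rate, Error ratio, and Duration. As an illustration of the kind of computation such a skill instruments (the `Request` dataclass and field names here are hypothetical, not taken from the skill):

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    ok: bool
    duration_ms: float


def red_metrics(requests: list[Request], window_s: float) -> dict:
    """RED method: Rate, Errors, Duration for one service over a time window."""
    n = len(requests)
    errors = sum(1 for r in requests if not r.ok)
    durations = sorted(r.duration_ms for r in requests)
    # p95 latency via nearest-rank; 0 when the window saw no traffic
    p95 = durations[max(0, math.ceil(0.95 * n) - 1)] if n else 0.0
    return {
        "rate_rps": n / window_s,
        "error_ratio": errors / n if n else 0.0,
        "p95_ms": p95,
    }
```

In Prometheus the same three signals would typically come from `rate()` over a request counter and `histogram_quantile()` over a latency histogram; the sketch above just shows the underlying arithmetic.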

Validation

75%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 12 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (870 lines); consider splitting into references/ and linking | Warning |
| description_trigger_hint | Description may be missing an explicit 'when to use' trigger hint (e.g., 'Use when...') | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| Total | | 12 / 16 |

Passed
