Define and implement Service Level Indicators (SLIs) and Service Level Objectives (SLOs) with error budgets and alerting. Use when establishing reliability targets, implementing SRE practices, or measuring service performance.
Overall score: 70

Quality: 56% (does it follow best practices?)
Impact: 92% (1.55x average score across 3 eval scenarios)
Validation: Passed, no known issues
Optimize this skill with Tessl: `npx tessl skill review --optimize ./plugins/observability-monitoring/skills/slo-implementation/SKILL.md`

Quality
Discovery
89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description with explicit 'Use when' triggers and domain-specific terminology that makes it highly distinguishable. Its main weakness is that the 'what' portion could be more specific about concrete actions beyond 'define and implement'—for example, specifying dashboard creation, burn-rate alerting configuration, or SLO policy definition.
Suggestions
- Expand the capability list with more concrete actions, e.g., 'create SLO dashboards, configure burn-rate alerts, calculate error budget consumption', to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (SLIs, SLOs, error budgets, alerting) and some actions ('define and implement'), but doesn't list multiple concrete actions in detail, e.g. it doesn't specify what 'implement' entails (dashboards, monitoring configs, burn-rate alerts, etc.). | 2 / 3 |
| Completeness | Clearly answers both 'what' (define and implement SLIs/SLOs with error budgets and alerting) and 'when' (explicit 'Use when' clause covering reliability targets, SRE practices, and measuring service performance). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'SLIs', 'SLOs', 'error budgets', 'alerting', 'reliability targets', 'SRE practices', 'service performance'. These cover the main terms a user working in this domain would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | The SLI/SLO/error budget domain is a clear niche within SRE. The specific terminology (SLIs, SLOs, error budgets) makes it highly unlikely to conflict with general monitoring or alerting skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
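The 'calculate error budget consumption' action mentioned in the suggestion reduces to a little arithmetic. A minimal sketch, assuming a 99.9% availability target and a 30-day rolling window (neither value comes from the skill itself):

```python
# Error-budget arithmetic behind SLO tracking. The 99.9% target and 30-day
# window below are illustrative assumptions, not values taken from the skill.

def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Total minutes of allowed unavailability over the rolling window."""
    return (1 - slo_target) * window_days * 24 * 60

def budget_consumed(observed_error_ratio: float, slo_target: float) -> float:
    """Fraction of the error budget spent so far (1.0 means fully spent)."""
    return observed_error_ratio / (1 - slo_target)

# A 99.9% SLO allows roughly 43.2 minutes of downtime per 30 days; an
# observed 0.05% error ratio means about half the budget is consumed.
```

A description that names this calculation explicitly ("calculate error budget consumption against a 30-day window") is easier for an agent to match than the bare "define and implement".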
Implementation
22%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is a comprehensive reference document on SLO implementation but suffers from significant verbosity, repeating the same PromQL patterns multiple times and explaining SRE concepts Claude already understands. It lacks a clear implementation workflow with validation steps, and references bundle files that don't exist. The actionable content (recording rules, alert rules) is decent but incomplete, with undefined metrics referenced in alerting rules.
Suggestions
- Eliminate repeated PromQL queries: define each SLI query once in the recording rules section and reference it elsewhere instead of duplicating.
- Remove explanatory content Claude already knows (the SLI/SLO/SLA hierarchy, the 'When to Use' list, generic 'Best Practices' advice, review process cadences) to cut the skill by ~40%.
- Add a clear sequential workflow: 1) Define SLIs → 2) Validate queries return data → 3) Create recording rules → 4) Verify recording rules → 5) Set up alerts → 6) Test alerts with synthetic errors.
- Either create the referenced bundle files (`references/slo-definitions.md`, `references/error-budget.md`) or remove the dead references, and define the missing burn-rate recording rules (`burn_rate_1h`, `burn_rate_6h`, `burn_rate_30m`) that the alerting rules depend on.
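The missing burn-rate recording rules called out above could be sketched as follows. This is a hedged illustration, not the skill's own configuration: the metric name `http_requests_total`, the job-agnostic aggregation, and the 99.9% target are all assumptions.

```yaml
groups:
  - name: slo_burn_rate
    rules:
      # burn rate = (error ratio in the window) / (1 - SLO target);
      # 1.0 means the budget is being spent at exactly the break-even pace.
      - record: burn_rate_1h
        expr: |
          (
            sum(rate(http_requests_total{status=~"5.."}[1h]))
            /
            sum(rate(http_requests_total[1h]))
          ) / (1 - 0.999)
      # burn_rate_6h and burn_rate_30m follow the same pattern with the
      # window swapped to [6h] and [30m] respectively.
      - record: burn_rate_6h
        expr: |
          (
            sum(rate(http_requests_total{status=~"5.."}[6h]))
            /
            sum(rate(http_requests_total[6h]))
          ) / (1 - 0.999)
```

Defining the ratio once per window in recording rules also resolves the duplication problem: the alerting rules can then compare these precomputed series against thresholds instead of re-deriving the PromQL inline.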
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Significant verbosity throughout. The SLI/SLO/SLA hierarchy explanation, 'When to Use' bullet list, 'Choose Appropriate SLOs' considerations, and 'Best Practices' are all concepts Claude already knows well. The same PromQL availability/latency queries are repeated 3-4 times across different sections. The review process section and dashboard ASCII art add little actionable value. | 1 / 3 |
| Actionability | Contains concrete PromQL queries and YAML configurations that are mostly executable, which is good. However, some queries reference metrics (`burn_rate_1h`, `burn_rate_6h`, `burn_rate_30m`) that are never defined in the recording rules, making the alerting rules incomplete. The dashboard section is descriptive rather than providing actual Grafana JSON or provisioning config. | 2 / 3 |
| Workflow Clarity | There is no clear sequential workflow for implementing SLOs. The content reads as a reference document with sections, but lacks a step-by-step implementation process. There are no validation checkpoints, e.g. no guidance on verifying recording rules work before setting up alerts, or testing that SLI queries return expected values before committing configurations. | 1 / 3 |
| Progressive Disclosure | References to `references/slo-definitions.md` and `references/error-budget.md` are mentioned but no bundle files exist, making these dead references. The content itself is quite long and monolithic: the Prometheus rules, alerting rules, and dashboard queries could be split into separate reference files. Related skills are mentioned at the end, which is good structure. | 2 / 3 |
| Total | | 6 / 12 (Passed) |
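The multiwindow burn-rate logic the incomplete alert rules depend on is simple arithmetic once the burn-rate series exist. A sketch under assumptions: the 14.4x fast-burn threshold and the 1h/5m window pairing follow commonly cited Google SRE Workbook defaults and may not match the skill's own values.

```python
# Multiwindow fast-burn paging condition, sketched in plain Python.
# Thresholds and windows are assumed defaults, not taken from the skill.

def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than break-even the error budget is burning."""
    return error_ratio / (1 - slo_target)

def fast_burn_page(err_1h: float, err_5m: float, slo: float = 0.999) -> bool:
    # Page only when BOTH the long (1h) and short (5m) windows exceed 14.4x:
    # the long window proves the burn is sustained, the short window proves
    # it is still happening. At 14.4x, a 30-day budget is gone in ~2 days.
    return burn_rate(err_1h, slo) > 14.4 and burn_rate(err_5m, slo) > 14.4
```

For example, a sustained 2% error ratio against a 99.9% target is a 20x burn and would page, while a 0.05% ratio (a 0.5x burn) would not. Encoding this as a numbered workflow step, with a synthetic-error test to trigger it, would address the missing validation checkpoints noted above.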
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
11 / 11 checks passed. Validation of the skill structure found no warnings or errors.
Commit: `34632bc`