CtrlK
BlogDocsLog inGet started
Tessl Logo

launchdarkly-metric-choose

Choose the right metrics for a LaunchDarkly experiment, guarded rollout, or release policy. Use when the user wants to know which metrics to use, which is the primary metric for an experiment, what guardrails to add, or which events to monitor in a rollout. Surfaces what will auto-attach from existing release policies before making additional recommendations.

93

1.16x
Quality

92%

Does it follow best practices?

Impact

92%

1.16x

Average score across 3 eval scenarios

SecuritybySnyk

Passed

No known issues

SKILL.md
Quality
Evals
Security

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong advisory skill with excellent workflow clarity and actionability. It provides clear decision frameworks for three distinct contexts, includes concrete output formats, and surfaces important edge cases. The main weakness is moderate verbosity — some explanations could be tightened without losing clarity, particularly around the descriptions of each context type and repeated emphasis on event health.

DimensionReasoningScore

Conciseness

The skill is well-written but somewhat verbose for an advisory skill. Some sections could be tightened — e.g., the explanations of what guarded rollouts and experiments are, the repeated emphasis on event health, and the detailed tables for secondary metric types. However, it largely avoids explaining things Claude already knows and most content earns its place.

2 / 3

Actionability

The skill provides concrete MCP tool calls, specific output formats to present to users, clear decision criteria (e.g., 'fewer than 5 metrics', 'only recommend metrics with events flowing'), and explicit examples of formatted recommendations. The guidance is specific and directly executable despite being an advisory skill without code.

3 / 3

Workflow Clarity

The workflow is clearly sequenced across 5 steps with explicit branching by context type (experiment vs guarded rollout vs release policy). Validation is built in — checking event health before recommending, surfacing auto-attached metrics before adding new ones, and flagging at-risk metrics. The 'Important Context' section adds critical guardrails for error-prone scenarios like mid-experiment changes and context kind mismatches.

3 / 3

Progressive Disclosure

The skill is well-organized with clear sections, appropriate use of tables, and well-signaled references to related skills (metric-create, metric-instrument) at the end. For a standalone skill with no bundle files, the content is appropriately structured — the main workflow is in the body with advanced concerns in a separate 'Important Context' section.

3 / 3

Total

11

/

12

Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines its scope (LaunchDarkly metric selection for experiments and rollouts), provides explicit trigger guidance via a 'Use when' clause with multiple natural trigger scenarios, and differentiates itself through domain-specific terminology. The mention of surfacing auto-attached metrics from release policies adds a concrete, distinctive capability that further reduces conflict risk.

DimensionReasoningScore

Specificity

Lists multiple concrete actions: choosing metrics, identifying primary metrics for experiments, adding guardrails, monitoring events in rollouts, and surfacing auto-attached metrics from existing release policies.

3 / 3

Completeness

Clearly answers both 'what' (choose the right metrics, surface auto-attached metrics, make recommendations) and 'when' (explicit 'Use when' clause covering multiple trigger scenarios like wanting to know which metrics to use, primary metric selection, guardrail addition, and event monitoring).

3 / 3

Trigger Term Quality

Includes strong natural trigger terms users would say: 'metrics', 'experiment', 'primary metric', 'guardrails', 'rollout', 'release policy', 'LaunchDarkly', 'events to monitor'. These cover the natural vocabulary a user would use when asking about experiment metrics.

3 / 3

Distinctiveness Conflict Risk

Highly distinctive with a clear niche: LaunchDarkly experiment/rollout metric selection. The combination of LaunchDarkly-specific terminology, metric selection focus, and release policy context makes it very unlikely to conflict with other skills.

3 / 3

Total

12

/

12

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
launchdarkly/ai-tooling
Reviewed

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.