
launchdarkly-experiment-setup

Set up and run experiments in LaunchDarkly. Create experiments with metrics, treatments, and flag config, start iterations to collect data, swap design between iterations, and stop with a winner.

62

Quality

72%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./skills/experiments/launchdarkly-experiment-setup/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with excellent workflow clarity and concrete JSON examples for every step of the experiment lifecycle. Its main weakness is moderate verbosity — the 'Core Concepts' section explains things Claude could infer, and some content could be more concise. The single-file structure is well-organized but would benefit from splitting reference material into supporting files.

Suggestions

Remove or drastically condense the 'What Are Experiments?' section — Claude doesn't need concept definitions when the workflow and JSON payloads already make the model clear.

Trim the introductory paragraph ('You're using a skill that guides you...') which restates the skill description without adding actionable value.

Dimension | Reasoning | Score
Conciseness | The skill includes some unnecessary explanatory content (e.g., 'What Are Experiments?' section explains concepts Claude can infer from the API parameters, and the introductory paragraph restates the description). However, the bulk of the content is practical and reference-worthy. Could be tightened by ~30%. | 2 / 3
Actionability | Provides fully concrete, copy-paste-ready JSON payloads for every step (create, start, evolve, stop). MCP tool names are explicit, required fields are shown in context, and optional fields are clearly enumerated with their purposes. | 3 / 3
Workflow Clarity | The 7-step workflow is clearly sequenced with explicit validation (Step 5: verify status, treatments, and metrics). The lifecycle is well-defined with state transitions (not_started → running → stopped), and the mid-experiment evolution path includes a clear decision tree (light edits vs. real design changes). The stop step includes the non-obvious constraint about requiring a winner and how to handle inconclusive results. | 3 / 3
Progressive Disclosure | The content is well-structured with clear sections and tables, but it's a monolithic ~200-line file with no bundle files to offload detail into. The edge cases table and optional fields could be split into reference files. However, since no bundle is provided, the inline approach is the only option, and the organization within the single file is reasonable. | 2 / 3
Total | | 10 / 12

Passed

Description

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description does a strong job listing specific capabilities for LaunchDarkly experimentation and is clearly distinguishable from other skills. However, it lacks an explicit 'Use when...' clause which caps completeness, and could benefit from additional natural trigger terms like 'A/B testing' or 'feature flags' that users might commonly use.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about LaunchDarkly experiments, A/B testing, or feature flag experimentation.'

Include common user-facing trigger terms like 'A/B testing', 'feature flags', 'feature experimentation', or 'experiment results' to improve discoverability.
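
As an illustration of the two suggestions above, a revised description might look like the following. This is only a sketch: it assumes the skill declares `name` and `description` in YAML frontmatter at the top of SKILL.md, and the added "Use when..." sentence is hypothetical wording, not the maintainers' text.

```yaml
---
# Hypothetical revision; the frontmatter layout and the "Use when" wording are assumptions.
name: launchdarkly-experiment-setup
description: >-
  Set up and run experiments in LaunchDarkly. Create experiments with metrics,
  treatments, and flag config, start iterations to collect data, swap design
  between iterations, and stop with a winner. Use when the user asks about
  LaunchDarkly experiments, A/B testing, feature flags, or experiment results.
---
```

The original description is kept intact so the Specificity score is not put at risk; only the trigger clause and the extra user-facing terms are appended.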

Dimension | Reasoning | Score
Specificity | Lists multiple specific concrete actions: create experiments with metrics/treatments/flag config, start iterations, swap design between iterations, and stop with a winner. These are detailed, actionable capabilities. | 3 / 3
Completeness | Clearly answers 'what does this do' with specific actions, but lacks an explicit 'Use when...' clause or equivalent trigger guidance. The when is only implied by the nature of the actions described. | 2 / 3
Trigger Term Quality | Includes good terms like 'LaunchDarkly', 'experiments', 'metrics', 'treatments', 'flag config', and 'iterations', but misses common user variations like 'A/B testing', 'feature flags', 'experimentation platform', or 'feature experiments' that users might naturally say. | 2 / 3
Distinctiveness Conflict Risk | Very clearly scoped to LaunchDarkly experimentation specifically, with distinct terminology (iterations, treatments, flag config) that is unlikely to conflict with other skills. | 3 / 3
Total | | 10 / 12

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure

No warnings or errors.

Repository
launchdarkly/ai-tooling
Reviewed
