launchdarkly-experiment-setup

Set up and run experiments in LaunchDarkly. Create experiments with metrics, treatments, and flag config, start iterations to collect data, swap design between iterations, and stop with a winner.

62

Quality

72%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/experiments/launchdarkly-experiment-setup/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill with clear workflow sequencing and concrete JSON examples for every step. Its main weakness is moderate verbosity: the 'Core Concepts' section explains things Claude could infer, and the document is long enough that moving detail into supporting files (progressive disclosure) would improve token efficiency. Workflow clarity is strong, with explicit verification steps and good handling of edge cases such as mid-experiment design changes and inconclusive results.

Suggestions

Trim or remove the 'What Are Experiments?' section. Claude understands these concepts from the API parameters and tool descriptions, so keep only non-obvious details like 'fallthrough rule id is the string "fallthrough"'.

Consider extracting the optional fields reference tables and edge cases into a separate REFERENCE.md to reduce the main skill's token footprint.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill includes some unnecessary explanatory content (e.g., 'What Are Experiments?' section explains concepts Claude can infer from the API parameters), and the introductory paragraph restates the description. However, the workflow sections are reasonably efficient with concrete JSON examples. | 2 / 3 |
| Actionability | The skill provides fully executable JSON payloads for every step (create, start, evolve, stop), specific field names, concrete examples with realistic values, and clear parameter guidance. The edge cases table and 'What NOT to Do' section add practical, specific constraints. | 3 / 3 |
| Workflow Clarity | The lifecycle is clearly sequenced (Steps 1-7) with explicit validation in Step 5 (verify running status, check treatments/metrics). The mid-experiment evolution path (Step 6) includes a clear decision tree between light edits and real design changes. The stop step includes the non-obvious requirement of declaring a winner and how to handle inconclusive results. | 3 / 3 |
| Progressive Disclosure | The content is a single monolithic file with no references to supporting documents, which is acceptable given no bundle files exist. However, at ~200 lines with detailed JSON examples and tables, some content (e.g., the full create-experiment payload, optional fields reference, edge cases) could benefit from being split into separate files for better organization. | 2 / 3 |

Total: 10 / 12 (Passed)

Description

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description does a strong job listing specific capabilities within the LaunchDarkly experimentation domain and is highly distinctive. Its main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know exactly when to select this skill. Adding common user-facing trigger terms like 'A/B testing' or 'feature flags' would also improve discoverability.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about setting up, running, or managing experiments in LaunchDarkly.'

Include common alternative trigger terms users might say, such as 'A/B testing', 'feature flags', 'feature experimentation', or 'experiment results'.
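
The two suggestions above can be sketched as a small check against the description text. This is only an illustration: `check_description`, `TRIGGER_TERMS`, and the `REVISED` wording are hypothetical, and the check is not how Tessl scores descriptions. `CURRENT` is the skill's description as shown on this page, and the trigger terms are the ones named in the suggestion.

```python
# Hypothetical lint for a skill description; NOT Tessl's actual scoring logic.

# The skill's current description, as shown on this page.
CURRENT = (
    "Set up and run experiments in LaunchDarkly. Create experiments with "
    "metrics, treatments, and flag config, start iterations to collect data, "
    "swap design between iterations, and stop with a winner."
)

# One possible rewrite (an assumption) that applies both suggestions.
REVISED = CURRENT + (
    " Use when the user asks about setting up, running, or managing "
    "experiments in LaunchDarkly, including A/B testing, feature flags, "
    "feature experimentation, or experiment results."
)

# Trigger terms taken from the suggestion text, lowercased for matching.
TRIGGER_TERMS = (
    "a/b testing",
    "feature flags",
    "feature experimentation",
    "experiment results",
)


def check_description(text):
    """Report whether a 'Use when' clause is present and which terms are missing."""
    lowered = text.lower()
    return {
        "has_use_when": "use when" in lowered,
        "missing_terms": [term for term in TRIGGER_TERMS if term not in lowered],
    }


if __name__ == "__main__":
    for label, text in (("current", CURRENT), ("revised", REVISED)):
        print(label, check_description(text))
```

With this check, the current description has no 'Use when' clause and lacks all four terms, while the revised wording satisfies both suggestions. The exact phrasing of the added clause is a matter of taste; the point is that it names when to select the skill and uses the words users are likely to type.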

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: create experiments with metrics/treatments/flag config, start iterations, swap design between iterations, stop with a winner. These are detailed, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers 'what' with specific actions (create experiments, start iterations, swap design, stop with winner), but lacks an explicit 'Use when...' clause or equivalent trigger guidance for when Claude should select this skill. | 2 / 3 |
| Trigger Term Quality | Includes good terms like 'experiments', 'LaunchDarkly', 'metrics', 'treatments', 'flag config', 'iterations', but misses common user variations like 'A/B testing', 'feature flags', 'experimentation platform', or 'feature experiments'. | 2 / 3 |
| Distinctiveness Conflict Risk | Very clearly scoped to LaunchDarkly experimentation specifically. The combination of 'LaunchDarkly' + 'experiments' + domain-specific terms like 'iterations', 'treatments', and 'flag config' makes this highly distinctive and unlikely to conflict with other skills. | 3 / 3 |

Total: 10 / 12 (Passed)

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
launchdarkly/ai-tooling
Reviewed
