experiment

Design, run, and learn from experiments that test your riskiest assumptions. Handles the full experiment lifecycle — from designing the test to recording results to propagating what you learned back into the opportunity space.

47

Quality

51%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./discovery/skills/experiment/SKILL.md

Quality

Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description communicates a reasonable high-level concept around experiment design and learning but lacks the concrete specificity and explicit trigger guidance needed for reliable skill selection. It uses appropriate third-person voice but relies on abstract product-management terminology without grounding in natural user language or clear 'Use when' triggers.

Suggestions

Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user wants to test a hypothesis, validate an assumption, design an experiment, or run a prototype test.'

Include more concrete actions and outputs, e.g., 'Creates experiment briefs, defines success metrics, documents results in structured templates, and updates the opportunity solution tree with findings.'

Add natural keyword variations users might say, such as 'hypothesis', 'validate', 'A/B test', 'test plan', 'experiment results', to improve trigger term coverage.
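
As a rough illustration of the suggestions above, the following sketch checks a description for a 'Use when' clause and for the trigger terms named in the third suggestion. The function `check_description` and the `TRIGGER_TERMS` list are hypothetical helpers written for this example; they are not how the review tool computes its discovery score.

```python
import re

# Hypothetical term list, copied from the suggestion above.
# It is not a list used by any real scorer.
TRIGGER_TERMS = ["hypothesis", "validate", "a/b test", "test plan", "experiment results"]


def check_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it mentions."""
    text = description.lower()
    has_use_when = re.search(r"\buse when\b", text) is not None
    present = [term for term in TRIGGER_TERMS if term in text]
    missing = [term for term in TRIGGER_TERMS if term not in text]
    return {"has_use_when": has_use_when, "present": present, "missing": missing}


# Shortened form of the skill's current description.
current = (
    "Design, run, and learn from experiments that test your riskiest assumptions. "
    "Handles the full experiment lifecycle."
)
# The same description with the 'Use when' clause from the first suggestion appended.
revised = current + (
    " Use when the user wants to test a hypothesis, validate an assumption, "
    "design an experiment, or run a prototype test."
)

print(check_description(current))
print(check_description(revised))
```

A plain substring match like this is crude (it ignores word forms such as "validating"), but it is enough to show which natural-language terms a description still lacks.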

Dimension | Reasoning | Score

Specificity

The description names a domain (experiments/assumption testing) and mentions some actions (design, run, learn, recording results, propagating learnings), but the actions are fairly high-level and lack concrete specifics about what formats, tools, or outputs are involved.

2 / 3

Completeness

The description covers the 'what' (design, run, and learn from experiments) but has no explicit 'Use when...' clause or equivalent trigger guidance, which under the rubric caps completeness at 2. Because the 'when' is entirely missing and not even clearly implied, the score drops to 1.

1 / 3

Trigger Term Quality

Terms like 'experiments', 'assumptions', 'opportunity space' are somewhat relevant but lean toward product management jargon. Missing common natural variations users might say like 'hypothesis testing', 'validate idea', 'A/B test', 'prototype test', or 'user research'.

2 / 3

Distinctiveness Conflict Risk

The experiment lifecycle focus provides some distinctiveness, but terms like 'riskiest assumptions' and 'opportunity space' are broad enough to overlap with strategy, product discovery, or research skills. Without explicit trigger boundaries, conflict risk remains moderate.

2 / 3

Total: 7 / 12

Passed

Implementation

70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured instruction-only skill that clearly defines the experiment lifecycle workflow, transitions to other skills, and boundaries of responsibility. Its main weakness is moderate verbosity in framing sections and reliance on external references for concrete execution details, making the skill itself more of a philosophical guide than a hands-on playbook. The progressive disclosure and workflow clarity are strong points.

Suggestions

Trim the opening paragraph and 'Your Stance' section — Claude doesn't need motivational framing like 'You are the empirical engine' or 'This is where you earn your keep.'

Add a concrete example of a complete experiment record (even abbreviated) inline, showing the flow from assumption → experiment design → result → assumption update, rather than deferring all concrete details to referenced files.
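
One way to picture the flow named in the second suggestion (assumption → experiment design → result → assumption update) is the minimal sketch below. The class names, field names, and status values are assumptions made for this example; they are not the schemas defined in the skill's experiment-record.md or assumption.md. The 7-of-10 threshold mirrors the success criterion quoted in the review, and the assumption text and actions are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assumption:
    statement: str
    status: str = "untested"  # hypothetical statuses: untested, supported, refuted


@dataclass
class ExperimentRecord:
    assumption: Assumption
    design: str
    success_threshold: int  # pre-committed before the run, e.g. 7
    sample_size: int        # e.g. 10
    action_if_pass: str     # pre-committed next step if the criterion is met
    action_if_fail: str     # pre-committed next step if it is not
    successes: Optional[int] = None  # filled in once the experiment has run

    def record_result(self, successes: int) -> str:
        """Store the result, update the assumption, and return the pre-committed action."""
        if not 0 <= successes <= self.sample_size:
            raise ValueError("successes must be between 0 and sample_size")
        self.successes = successes
        passed = successes >= self.success_threshold
        self.assumption.status = "supported" if passed else "refuted"
        return self.action_if_pass if passed else self.action_if_fail


assumption = Assumption("New users can complete the task without asking for help")
record = ExperimentRecord(
    assumption=assumption,
    design="Moderated usability test with 10 users",
    success_threshold=7,
    sample_size=10,
    action_if_pass="Proceed with the idea",
    action_if_fail="Revisit the idea before investing further",
)
next_action = record.record_result(successes=5)
print(assumption.status, "->", next_action)
```

Fixing the threshold and both follow-up actions before the result is recorded is what makes the pre-commitment visible in the record itself.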

Dimension | Reasoning | Score

Conciseness

The content is mostly efficient but includes some unnecessary philosophical framing ('You are the empirical engine of the discovery system') and explanatory prose that Claude doesn't need. The 'Your Stance' section, while useful, could be tighter. The outcome propagation and transitions sections are well-structured but slightly verbose.

2 / 3

Actionability

The skill provides concrete guidance on success criteria and pre-commitment with good examples ('7 of 10 users complete the task without asking for help'), and clear instructions for outcome propagation. However, it lacks executable code/commands and relies heavily on references for the actual how-to (design, lifecycle tracking). The skill itself is more directional than copy-paste actionable.

2 / 3

Workflow Clarity

The experiment lifecycle is clearly sequenced: design with success criteria → pre-commit actions → run → record results → update assumption → review parent ideas → check shared assumptions → suggest next actions. The outcome propagation section provides an explicit multi-step process with validation-like checkpoints (review impact, check shared assumptions). Transitions to other skills are clearly defined with trigger conditions.

3 / 3

Progressive Disclosure

The skill provides a clear overview with well-signaled one-level-deep references: design principles in references/design-experiment.md, lifecycle tracking in references/experiment-lifecycle.md, schemas in experiment-record.md and assumption.md, and the artifacts skill for writing guidance. Content is appropriately split between the overview and referenced files.

3 / 3

Total: 10 / 12

Passed
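
The outcome propagation steps described under Workflow Clarity (update the assumption, review parent ideas, check shared assumptions) could be sketched roughly as follows. The `propagate` function, the idea names, and the assumption IDs are all hypothetical; the skill's own references define the real process.

```python
from typing import Dict, List, Set, Union


def propagate(
    updated_assumption: str,
    idea_assumptions: Dict[str, Set[str]],
) -> Dict[str, Union[List[str], bool]]:
    """Find the ideas that rest on an updated assumption.

    idea_assumptions maps each idea name to the assumption IDs it depends on.
    An assumption is treated as shared when more than one idea depends on it.
    """
    parents = sorted(
        idea for idea, deps in idea_assumptions.items() if updated_assumption in deps
    )
    return {"parent_ideas_to_review": parents, "shared": len(parents) > 1}


# Hypothetical opportunity-space fragment: three ideas and their assumptions.
ideas = {
    "guided-setup": {"A1", "A2"},
    "setup-checklist": {"A1"},
    "import-wizard": {"A3"},
}

result = propagate("A1", ideas)
print(result)
```

In this example the update to A1 touches two ideas, so both would be queued for review before any next action is suggested.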

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria | Description | Result

allowed_tools_field

'allowed-tools' contains unusual tool name(s)

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 9 / 11

Passed
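
To see what a check like frontmatter_unknown_keys might be flagging, here is a minimal sketch that lists top-level frontmatter keys outside an assumed set of recognised ones. The `KNOWN_KEYS` set is an assumption, the `stance` key in the sample is invented for illustration, and the parser is deliberately naive rather than a real YAML parser; the validator's actual rules may differ.

```python
from typing import List

# Assumed set of recognised top-level keys; check the spec your validator targets.
KNOWN_KEYS = {"name", "description", "license", "compatibility", "metadata", "allowed-tools"}


def frontmatter_keys(skill_md: str) -> List[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block.

    Naive on purpose: it only reads unindented 'key: value' lines.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        is_top_level = line and not line[0].isspace() and not line.startswith(("#", "-"))
        if is_top_level and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_md: str) -> List[str]:
    """Return frontmatter keys that are not in the assumed known set."""
    return [key for key in frontmatter_keys(skill_md) if key not in KNOWN_KEYS]


# Hypothetical SKILL.md content; 'stance' is an invented key for this example.
sample = """---
name: experiment
description: Design, run, and learn from experiments.
stance: empirical
---
# Experiment
"""
print(unknown_keys(sample))
```

Keys reported this way could be removed or moved under `metadata`, as the warning text suggests.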

Repository
audenaert/etak
Reviewed
