CtrlK
BlogDocsLog inGet started
Tessl Logo

built-in-metrics

Instrument an existing codebase with LaunchDarkly config tracking. Walks the four-tier ladder (managed runner → provider package → custom extractor + trackMetricsOf → raw manual) and picks the lowest-ceremony option that still captures duration, tokens, and success/error.

64

Quality

76%

Does it follow best practices?

Impact

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/agentcontrol/built-in-metrics/SKILL.md
SKILL.md
Quality
Evals
Security

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill that provides a clear decision framework (the four-tier ladder) and concrete implementation guidance for instrumenting LaunchDarkly agent metrics. The workflow is well-sequenced with validation checkpoints, and progressive disclosure is handled effectively through provider-specific reference files. The main weakness is moderate verbosity — some sections could be tightened without losing clarity, particularly the tracker method table preamble and some repeated explanations of the generic pattern.

DimensionReasoningScore

Conciseness

The skill is fairly long but most content earns its place — the tier ladder, matrix, and method table are genuinely useful reference material. However, there's some redundancy (the generic shape explanation is repeated, and the tracker method table includes a preamble that restates things already covered). Some tightening is possible but it's not egregiously verbose.

2 / 3

Actionability

The skill provides concrete, specific guidance at every step: exact package names, exact method signatures in both Python and Node, a decision matrix with clear criteria, specific verification steps, and precise migration instructions for legacy API surfaces. The code patterns referenced are specific enough to be directly implementable.

3 / 3

Workflow Clarity

The workflow is clearly sequenced (explore → look up tier → implement from reference → verify) with explicit checklists at steps 1 and 4. Validation is thorough (check Monitoring tab, force an error, verify TTFT for streaming). Guardrails section provides explicit error-handling guidance and the feedback loop of 'if you're rewriting the call, you're at the wrong tier — drop down one' is excellent.

3 / 3

Progressive Disclosure

The skill is structured as an overview with clear one-level-deep references to provider-specific files (references/openai-tracking.md, references/langchain-tracking.md, etc.). The matrix table serves as a navigation hub pointing to the right reference file. Related skills are listed at the bottom for cross-navigation. However, no bundle files were provided to verify the references actually exist.

3 / 3

Total

11

/

12

Passed

Description

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is technically detailed and highly specific to LaunchDarkly config tracking instrumentation, making it very distinctive. However, it lacks an explicit 'Use when...' clause, which limits completeness, and the trigger terms lean heavily technical rather than matching natural user language. The specificity of the four-tier approach and metric types is a strength.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to add LaunchDarkly AI config tracking, feature flag metrics, or instrument LLM calls with LaunchDarkly.'

Include more natural trigger terms users might say, such as 'feature flags', 'AI config', 'LaunchDarkly SDK', 'LLM observability', or 'model metrics tracking'.

DimensionReasoningScore

Specificity

Lists multiple specific concrete actions: instrumenting a codebase with LaunchDarkly config tracking, walking a four-tier ladder of specific approaches (managed runner, provider package, custom extractor + trackMetricsOf, raw manual), and capturing duration, tokens, and success/error metrics.

3 / 3

Completeness

The 'what' is well-covered (instrument codebase with LaunchDarkly config tracking, pick the right tier, capture metrics), but there is no explicit 'Use when...' clause or equivalent trigger guidance telling Claude when to select this skill.

2 / 3

Trigger Term Quality

Includes domain-specific terms like 'LaunchDarkly', 'config tracking', 'metrics', 'tokens', 'duration', and 'instrumentation', but these are fairly technical. Missing more natural user phrases like 'feature flags', 'feature flag metrics', 'AI config', or 'LaunchDarkly SDK'. Users might not naturally say 'four-tier ladder' or 'trackMetricsOf'.

2 / 3

Distinctiveness Conflict Risk

Highly distinctive due to the specific mention of LaunchDarkly, the four-tier ladder methodology, and the particular metrics (duration, tokens, success/error). Very unlikely to conflict with other skills.

3 / 3

Total

10

/

12

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
launchdarkly/ai-tooling
Reviewed

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.