Name: built-in-metrics
Rating: 62.4 (1 review)
Author: launchdarkly

built-in-metrics

Instrument an existing codebase with LaunchDarkly config tracking. Walks the four-tier ladder (managed runner → provider package → custom extractor + trackMetricsOf → raw manual) and picks the lowest-ceremony option that still captures duration, tokens, and success/error.
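
The selection rule in the description can be sketched as a small function. This is a minimal illustration, not the skill's actual logic. It assumes the tiers are listed from least to most ceremony, and that the caller supplies which metrics each tier would capture in a given codebase. The names `TIERS`, `REQUIRED`, and `pick_tier` are hypothetical and do not come from the skill or the LaunchDarkly SDK.

```python
from typing import Iterable, Mapping

# Tier names as listed in the skill description.
# Assumption: they are ordered from least to most ceremony.
TIERS = (
    "managed runner",
    "provider package",
    "custom extractor + trackMetricsOf",
    "raw manual",
)

# The three metrics the description says must still be captured.
REQUIRED = frozenset({"duration", "tokens", "success/error"})


def pick_tier(captures: Mapping[str, Iterable[str]]) -> str:
    """Return the first tier (in ladder order) whose metrics cover REQUIRED.

    `captures` is a hypothetical input: for each tier usable in the
    codebase, the set of metrics that tier would record. Tiers missing
    from the mapping are treated as unavailable.
    """
    for tier in TIERS:
        if REQUIRED <= set(captures.get(tier, ())):
            return tier
    raise ValueError("no tier captures duration, tokens, and success/error")
```

For example, if only the provider package and raw manual tracking cover all three metrics, `pick_tier` returns `"provider package"`, because it sits higher on the ladder than raw manual tracking.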

Quality

72%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

To fix and improve this skill with Tessl, run:

tessl review fix ./skills/agentcontrol/built-in-metrics/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill that provides a clear decision framework (the four-tier ladder) and concrete implementation guidance for LaunchDarkly agent metrics instrumentation. Its main weakness is length: the tracker methods table and some explanatory passages could be trimmed or moved to reference files. The workflow is exemplary, with explicit checklists, validation steps, and guardrails. Progressive disclosure could not be fully verified, because no bundle files were provided to confirm that the referenced files exist.

Suggestions

Move the full tracker methods table to a reference file (e.g., references/tracker-methods.md) and keep only a brief summary inline, reducing the SKILL.md body length significantly.

Trim explanatory asides like the runId paragraph and the 'Going lower looks flexible but costs you drift' framing — Claude doesn't need persuasion, just the rule.

Dimension	Reasoning	Score
Conciseness	The skill is fairly long and includes some explanatory context that Claude likely doesn't need (e.g., explaining why going lower on the tier ladder is bad, explaining what runId does in detail). However, most content is genuinely informative and specific to LaunchDarkly's API surface, which Claude wouldn't inherently know. The tracker methods table is borderline — useful as reference but adds significant length.	2 / 3
Actionability	The skill provides concrete package names, exact method signatures in both Python and Node, a clear decision matrix for tier selection, specific migration steps for pre-0.20 API surfaces, and precise verification steps. The checklist-driven workflow and provider/framework matrix give Claude everything needed to make implementation decisions and execute them.	3 / 3
Workflow Clarity	The four-step workflow (explore → look up tier → implement from reference → verify) is clearly sequenced with explicit checklists at steps 1 and 4. Validation is thorough: check Monitoring tab, force an error, verify TTFT for streaming. The guardrails section adds important error-handling constraints. The feedback loop for errors (force error → confirm count increments) is explicit.	3 / 3
Progressive Disclosure	The skill references multiple provider-specific reference files (e.g., references/openai-tracking.md, references/streaming-tracking.md) which is good progressive disclosure design, but no bundle files were provided to verify these exist. The SKILL.md itself is quite long (~200+ lines) and the full tracker methods table could arguably live in a reference file. The tier matrix and provider matrix are appropriately inline as they drive the core decision logic.	2 / 3
Total		10 / 12 (Passed)

Description

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is technically detailed and specific about what it does, clearly carving out a distinct niche around LaunchDarkly config tracking instrumentation. However, it lacks an explicit 'Use when...' clause and could benefit from more natural trigger terms that users would actually say when needing this skill. The internal jargon ('four-tier ladder', 'lowest-ceremony option') is informative but may not match typical user queries.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to add LaunchDarkly AI config tracking, feature flag observability, or LLM metrics instrumentation to an existing project.'

Include more natural trigger terms users might say, such as 'feature flags', 'AI config', 'LLM observability', 'telemetry', or 'LaunchDarkly SDK integration'.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: instrumenting a codebase with LaunchDarkly config tracking, walking a four-tier ladder of specific approaches (managed runner, provider package, custom extractor + trackMetricsOf, raw manual), and capturing duration, tokens, and success/error metrics.	3 / 3
Completeness	Clearly answers 'what does this do' (instrument a codebase with LaunchDarkly config tracking using a tiered approach), but lacks an explicit 'Use when...' clause. The 'when' is only implied by the description of the task itself.	2 / 3
Trigger Term Quality	Includes some relevant keywords like 'LaunchDarkly', 'config tracking', 'duration', 'tokens', and specific API terms like 'trackMetricsOf', but misses common user-facing terms like 'feature flags', 'feature management', 'observability', 'telemetry', or 'AI config'. Users may not naturally say 'four-tier ladder' or 'lowest-ceremony option'.	2 / 3
Distinctiveness Conflict Risk	Highly distinctive due to the specific mention of LaunchDarkly, the four-tier ladder methodology, and the particular metrics (duration, tokens, success/error). This is unlikely to conflict with other skills given its narrow, well-defined niche.	3 / 3
Total		10 / 12 (Passed)

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: launchdarkly/ai-tooling
Commit: 913b745

Reviewed: 10 days ago
