Guide deterministic runtime investigations in live environments using Lightrun MCP tools, with preflight gating, recovery/resume rules, evidence-first diagnosis, and explicit blocker/handoff outputs.
58
36%
Does it follow best practices?
Impact
99%
1.16x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/lightrun-live-runtime-debugging/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies its domain (Lightrun-based live debugging) but relies heavily on internal methodology jargon rather than concrete actions or natural user language. It lacks an explicit 'Use when...' clause, making it difficult for Claude to know when to select this skill. The description would benefit from listing specific actions and adding clear trigger conditions.
Suggestions
Add an explicit 'Use when...' clause with natural trigger terms like 'debug production', 'live debugging', 'Lightrun', 'add snapshot', 'runtime logs', 'production issue'.
Replace methodology jargon ('preflight gating', 'blocker/handoff outputs') with concrete actions like 'add dynamic logs to running services', 'capture snapshots without redeploying', 'inspect variable state in production'.
Include common file or tool references users might mention, such as 'Lightrun agents', '.java', '.py', or 'production environment debugging'.
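The suggestions above can be turned into a quick self-check before resubmitting. The sketch below is a hypothetical heuristic, not Tessl's scoring logic: `lint_description` and `TRIGGER_TERMS` are names invented for illustration, the term list is copied from the first suggestion, and `revised` is only one possible rewrite assembled from the phrases the review proposes.

```python
import re

# Trigger terms copied from the review's first suggestion.
# The check is an illustrative heuristic, NOT how Tessl scores discovery.
TRIGGER_TERMS = [
    "debug production",
    "live debugging",
    "lightrun",
    "add snapshot",
    "runtime logs",
    "production issue",
]


def lint_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it contains."""
    text = description.lower()
    return {
        "has_use_when": re.search(r"\buse when\b", text) is not None,
        "trigger_terms_found": [term for term in TRIGGER_TERMS if term in text],
    }


# The description as currently published.
current = (
    "Guide deterministic runtime investigations in live environments using "
    "Lightrun MCP tools, with preflight gating, recovery/resume rules, "
    "evidence-first diagnosis, and explicit blocker/handoff outputs."
)

# A possible rewrite (an assumption, built from the suggested phrases).
revised = (
    "Add dynamic logs to running services, capture snapshots without "
    "redeploying, and inspect variable state in production with Lightrun MCP "
    "tools. Use when the user asks for live debugging, reports a production "
    "issue, or wants to add snapshot or runtime logs."
)

if __name__ == "__main__":
    print("current:", lint_description(current))
    print("revised:", lint_description(revised))
```

Run against the two strings, the current description reports no 'Use when' clause and only the `lightrun` term, while the rewrite reports the clause plus five of the six listed terms. A passing check here does not guarantee a higher score from the reviewer.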
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (runtime investigations, live environments, Lightrun MCP tools) and mentions several concepts (preflight gating, recovery/resume rules, evidence-first diagnosis, blocker/handoff outputs), but these read more like abstract methodology labels than concrete user-facing actions. It doesn't list specific actions like 'add a snapshot', 'set a log', or 'capture metrics'. | 2 / 3 |
| Completeness | The description addresses 'what' at a high level but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' itself is also somewhat vague, warranting a score of 1. | 1 / 3 |
| Trigger Term Quality | Includes 'Lightrun' and 'runtime investigations' which are relevant keywords, but terms like 'preflight gating', 'blocker/handoff outputs', and 'recovery/resume rules' are internal methodology jargon rather than natural phrases a user would say. Missing common user terms like 'debug', 'production debugging', 'live logging', 'snapshot', 'breakpoint'. | 2 / 3 |
| Distinctiveness Conflict Risk | The mention of 'Lightrun MCP tools' provides some distinctiveness and narrows the domain, but the broader framing around 'runtime investigations in live environments' could overlap with general debugging or observability skills. The methodology terms don't help disambiguate from a user's perspective. | 2 / 3 |
| Total | 7 / 12 Passed | |
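As a check on the total row, the four dimension scores in the table sum to the reported 7 out of 12. The short sketch below only reproduces that addition; the variable names are illustrative, and it makes no assumption about how the 32% headline figure is derived.

```python
# Scores copied from the Discovery table above; each dimension is scored out of 3.
MAX_PER_DIMENSION = 3

discovery_scores = {
    "Specificity": 2,
    "Completeness": 1,
    "Trigger Term Quality": 2,
    "Distinctiveness Conflict Risk": 2,
}

total = sum(discovery_scores.values())
maximum = MAX_PER_DIMENSION * len(discovery_scores)

if __name__ == "__main__":
    print(f"{total} / {maximum}")  # prints "7 / 12"
```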
Implementation
39%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill demonstrates excellent workflow design with clear sequencing, validation gates, and error recovery paths for a complex runtime debugging process. However, it is severely over-engineered for a single SKILL.md file—the extreme verbosity and repetition across sections (async handling alone is covered in 5+ places) wastes token budget. The lack of concrete executable examples and the monolithic structure significantly reduce its practical effectiveness.
Suggestions
Reduce content by 50-60% by eliminating repeated rules across sections (e.g., consolidate all async handling into one section, remove the Runtime Quality Checklist which duplicates body rules).
Split detailed protocols (Async Runtime Action Protocol, Action Error Mitigation, Output Contract, Runtime Quality Checklist) into separate referenced files to improve progressive disclosure.
Add at least one concrete worked example showing a real investigation scenario with actual tool calls, hypothesis evaluation, and diagnosis output.
Remove explanatory rationale from rules (e.g., 'Do not issue final diagnosis while required async actions are still pending/running without checking status' can be shortened to a bullet point) to improve token efficiency.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~300+ lines with extensive repetition across sections. Many rules are restated in slightly different forms (e.g., async handling is covered in Async Activation Gate, Async Runtime Action Protocol, Resume Criteria, Action Error Mitigation, and again in the Flow steps). Much of the content describes process logic that Claude can infer from a more compact specification. The Runtime Quality Checklist largely duplicates rules already stated in the body. | 1 / 3 |
| Actionability | The skill provides a clear structured workflow with specific tool names (e.g., `lightrun__get_runtime_sources`) and concrete decision criteria (e.g., '2 consecutive no-hit synchronous probes'). However, there are no executable code examples, no concrete command-line invocations, and the investigation template is a blank skeleton rather than a worked example. The guidance is specific but not copy-paste ready. | 2 / 3 |
| Workflow Clarity | The multi-step flow is clearly sequenced (15 numbered steps) with explicit success criteria for each step, validation checkpoints (preflight gate, async activation gate, cleanup gate), and feedback loops (retargeting after no-hit, strategy change after 2 low-information actions). Error recovery paths are well-defined with specific branching logic for sync vs async modes. | 3 / 3 |
| Progressive Disclosure | The entire skill is a monolithic wall of text with no references to supporting files despite being ~300+ lines. Content like the Async Runtime Action Protocol, Action Error Mitigation details, and the full Output Contract could be split into separate reference files. There are no bundle files provided, and the skill makes no attempt to organize content across files for progressive discovery. | 1 / 3 |
| Total | 7 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
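To locate the keys behind a `frontmatter_unknown_keys` warning, a local check along the following lines may help. This is a minimal sketch under stated assumptions: `ALLOWED_KEYS` is a guess at the permitted set and should be replaced with the list from the skill spec you validate against, the parser only reads top-level `key:` lines rather than full YAML, and the sample document with its `version` key is hypothetical, since the report does not say which keys were flagged.

```python
import re

# ASSUMPTION: illustrative allow-list; substitute the keys your skill spec permits.
ALLOWED_KEYS = {"name", "description", "license", "metadata", "allowed-tools"}


def frontmatter_keys(skill_md: str) -> list[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter present
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Only unindented "key:" lines count as top-level keys.
        match = re.match(r"^([A-Za-z0-9_-]+)\s*:", line)
        if match:
            keys.append(match.group(1))
    return keys


def unknown_keys(skill_md: str) -> list[str]:
    """List frontmatter keys that are not in the assumed allow-list."""
    return [key for key in frontmatter_keys(skill_md) if key not in ALLOWED_KEYS]


# Hypothetical SKILL.md content for demonstration only.
sample = """---
name: lightrun-live-runtime-debugging
description: Example description.
version: 1
metadata:
  owner: example
---
# Skill body
"""

if __name__ == "__main__":
    print(unknown_keys(sample))
```

As the warning text suggests, any key reported this way can either be removed or moved under `metadata`.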
c757387