Agent skill for test-long-runner - invoke with $agent-test-long-runner
37
3%
Does it follow best practices?
Impact
96%
0.98x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./.agents/skills/agent-test-long-runner/SKILL.md
Quality
Discovery
0%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is essentially a placeholder that provides no useful information about the skill's purpose, capabilities, or appropriate usage context. It only contains an invocation command and a vague label, making it impossible for Claude to determine when to select this skill from a list of available options.
Suggestions
Add a clear statement of what the skill does with concrete actions (e.g., 'Runs long-duration test suites, monitors test progress, and reports results').
Add an explicit 'Use when...' clause with natural trigger terms users would say (e.g., 'Use when the user needs to run extended test suites, long-running tests, or performance/endurance tests').
Replace the invocation instruction ('invoke with $agent-test-long-runner') with capability-focused language — invocation details belong in the skill body, not the description.
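The three suggestions above can be turned into a rough pre-submission check. The sketch below is only an illustration: `lint_description` and its heuristics (including the 12-word threshold) are hypothetical and are not Tessl's scoring rubric, and the improved wording is borrowed from the review's own examples, so it may not match what the skill actually does.

```python
import re


def lint_description(description):
    """Return a list of problems found by three rough heuristics.

    Hypothetical helper for illustration; not Tessl's actual rubric.
    """
    problems = []
    # Heuristic 1: an explicit "Use when..." clause should be present.
    if not re.search(r"\buse when\b", description, re.IGNORECASE):
        problems.append("missing 'Use when...' clause")
    # Heuristic 2: invocation tokens such as $agent-name belong in the skill body.
    if re.search(r"\$[\w-]+", description):
        problems.append("contains an invocation token")
    # Heuristic 3: very short descriptions rarely name concrete actions
    # (the 12-word cutoff is an arbitrary assumption).
    if len(description.split()) < 12:
        problems.append("too short to describe concrete actions")
    return problems


before = "Agent skill for test-long-runner - invoke with $agent-test-long-runner"
after = (
    "Runs long-duration test suites, monitors test progress, and reports results. "
    "Use when the user needs to run extended test suites, long-running tests, "
    "or performance/endurance tests."
)

print(lint_description(before))
print(lint_description(after))
```

Under these heuristics the current description trips all three checks, while the reworded one trips none.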
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description provides no concrete actions whatsoever. 'Agent skill for test-long-runner' is entirely vague and does not describe what the skill actually does. | 1 / 3 |
| Completeness | Neither 'what does this do' nor 'when should Claude use it' is answered. The description only states how to invoke it, not what it does or when it should be selected. | 1 / 3 |
| Trigger Term Quality | The only terms present are 'test-long-runner' and '$agent-test-long-runner', which are internal identifiers rather than natural keywords a user would say. No natural language trigger terms are included. | 1 / 3 |
| Distinctiveness Conflict Risk | The description is so generic that it provides no distinguishing information. 'Agent skill for test-long-runner' could be anything and offers no clear niche or distinct triggers. | 1 / 3 |
| Total | | 4 / 12 Passed |
Implementation
7%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides no actionable or novel information beyond what Claude already knows. It consists entirely of generic advice ('be thorough', 'take your time', 'document everything') and lists of broad capabilities that are inherent to Claude. There is nothing specific about handling long-running tasks, managing context over extended interactions, or any concrete methodology.
Suggestions
Replace generic advice with specific, actionable techniques for managing long-running tasks (e.g., how to checkpoint progress, how to structure output for tasks that span many tool calls, how to handle context window limits).
Add concrete examples with executable code or specific commands for at least one use case (e.g., a step-by-step codebase analysis workflow with specific tools and validation steps).
Define what makes this agent different from default Claude behavior - include specific strategies, tool usage patterns, or output templates that are unique to long-running task management.
Remove the Capabilities section entirely as it just restates general Claude abilities, and replace the Instructions section with a concrete workflow that includes validation checkpoints.
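As one possible shape for the checkpointing technique suggested above, here is a minimal sketch. It assumes the long-running task can be split into named, repeatable steps; the helper names (`load_checkpoint`, `save_checkpoint`, `run_steps`) and the JSON state layout are hypothetical and do not come from the skill under review.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text(encoding="utf-8"))
    return {"done": [], "results": {}}


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the checkpoint.

    The rename is atomic, so an interruption never leaves a half-written file.
    """
    p = Path(path)
    fd, tmp = tempfile.mkstemp(dir=str(p.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp, p)


def run_steps(steps, path):
    """Run each (name, fn) step once, skipping steps already recorded as done."""
    state = load_checkpoint(path)
    for name, fn in steps:
        if name in state["done"]:
            continue  # finished in an earlier run; do not repeat the work
        state["results"][name] = fn()  # results must be JSON-serializable
        state["done"].append(name)
        save_checkpoint(path, state)  # validation point after every step
    return state
```

Calling `run_steps` a second time with the same checkpoint path resumes where the previous run stopped, which is the kind of concrete, verifiable behavior the review asks the skill to spell out.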
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is almost entirely generic advice that Claude already knows ('Take Your Time', 'Be Thorough', 'Document Everything'). It explains nothing Claude doesn't already understand and adds no domain-specific knowledge. The capabilities list is just a restatement of general Claude abilities. | 1 / 3 |
| Actionability | There are no concrete commands, code examples, specific tools, or executable guidance. Everything is vague direction like 'Deep dive into codebases' and 'Comprehensive research across multiple sources' without any specifics on how to actually do these things. | 1 / 3 |
| Workflow Clarity | The numbered 'Instructions' are generic platitudes (Take Your Time, Be Thorough, Iterate) rather than a meaningful workflow. There are no concrete steps, no validation checkpoints, and no sequenced process for any task. | 1 / 3 |
| Progressive Disclosure | The content has some structural organization with clear section headers (Capabilities, Instructions, Output Format, Example Use Cases), but there are no references to external files and the content itself is too thin to warrant splitting. The organization exists but serves little purpose given the lack of substantive content. | 2 / 3 |
| Total | | 5 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
398f7c2