agent-test-long-runner

Agent skill for test-long-runner - invoke with $agent-test-long-runner

Quality

Does it follow best practices?

Impact

96%

0.98x

Average score across 3 eval scenarios

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.agents/skills/agent-test-long-runner/SKILL.md

Quality

Content

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is essentially a collection of generic motivational advice with no actionable, concrete, or novel content. It tells Claude to 'be thorough' and 'take your time' without providing any specific tools, commands, workflows, or techniques for handling long-running tasks. The skill adds no value beyond what Claude already knows how to do.

Suggestions

Replace vague instructions with concrete, actionable guidance — e.g., specific strategies for breaking large tasks into checkpointed subtasks, how to write progress updates, or how to structure intermediate outputs.

Add executable examples: show a concrete workflow for at least one use case (e.g., a codebase analysis) with specific steps, tool invocations, and expected outputs.

Define what 'long-running' means operationally — include concrete techniques for managing state, checkpointing progress, and recovering from interruptions.

Remove the capabilities list and example use cases sections entirely — they describe things Claude already knows and waste token budget.
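
As an illustration of the checkpointing and recovery guidance these suggestions ask for, here is a minimal Python sketch. It is not taken from the reviewed skill. The function names (`load_checkpoint`, `save_checkpoint`, `run_subtasks`) and the JSON state layout are hypothetical, chosen only for this example. It assumes a long task can be split into named subtasks whose results are JSON-serializable.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> dict:
    """Return saved state, or a fresh state if no checkpoint exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"done": [], "results": {}}


def save_checkpoint(path: Path, state: dict) -> None:
    """Write state to a temp file, then rename it over the checkpoint.

    The rename is atomic, so an interruption cannot leave a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
    os.replace(tmp_name, path)


def run_subtasks(subtasks: dict, checkpoint: Path) -> dict:
    """Run each subtask once, skipping any already recorded in the checkpoint."""
    state = load_checkpoint(checkpoint)
    for name, task in subtasks.items():
        if name in state["done"]:
            continue  # finished in an earlier run; do not repeat it
        state["results"][name] = task()
        state["done"].append(name)
        save_checkpoint(checkpoint, state)  # persist progress after every step
    return state
```

If a run is interrupted partway through, calling `run_subtasks` again with the same checkpoint path resumes from the first unfinished subtask. Saving after every subtask trades some I/O for a smaller amount of repeated work on recovery.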

Dimension	Reasoning	Score
Conciseness	The content is padded with vague, generic advice that Claude already knows ('Take Your Time', 'Be Thorough', 'Document Everything'). The capabilities list, example use cases, and output format sections are all things Claude inherently understands and add no novel information. Nearly every token is wasted.	1 / 3
Actionability	There are no concrete commands, executable code, specific tools, or actionable steps. Every instruction is abstract and vague ('Deep dive into codebases', 'Continuously improve and refine your work'). Nothing here tells Claude what to actually do differently.	1 / 3
Workflow Clarity	The numbered 'Instructions' are generic platitudes, not a workflow. There is no sequencing of steps, no validation checkpoints, and no concrete process for handling long-running tasks. The skill doesn't define what makes a task 'long-running' or how to manage that operationally.	1 / 3
Progressive Disclosure	The content has some structural organization with headers and sections, but there are no references to external files and no bundle files. The content is relatively short so deep nesting isn't an issue, but the sections themselves contain low-value content that doesn't warrant the structure.	2 / 3
	Total	5 / 12 Passed

Description

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is essentially a placeholder with no meaningful content. It fails on every dimension: it doesn't describe any capabilities, includes no natural trigger terms, doesn't explain when to use the skill, and provides nothing to distinguish it from other skills. The only information conveyed is the invocation command.

Suggestions

Add concrete actions describing what the skill does (e.g., 'Runs long-duration test suites, monitors test progress, and reports results').

Add an explicit 'Use when...' clause with natural trigger terms (e.g., 'Use when the user needs to run extended test suites, long-running tests, or performance/endurance tests').

Replace the identifier-focused description with user-facing language that explains the skill's purpose and distinguishes it from other testing-related skills.

Dimension	Reasoning	Score
Specificity	The description provides no concrete actions whatsoever. 'Agent skill for test-long-runner' is entirely vague and does not describe what the skill actually does.	1 / 3
Completeness	Neither 'what does this do' nor 'when should Claude use it' is answered. The description only states how to invoke it, not what it does or when it should be selected.	1 / 3
Trigger Term Quality	The only terms present are 'test-long-runner' and '$agent-test-long-runner', which are internal identifiers rather than natural keywords a user would say. No natural language trigger terms are included.	1 / 3
Distinctiveness Conflict Risk	The description is so generic that it provides no distinguishing information. 'Agent skill for test-long-runner' could be anything, making it impossible to differentiate from other skills.	1 / 3
	Total	4 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: ruvnet/ruflo
Commit: cc8830d

Reviewed: 3 days ago
