
agent-test-long-runner

Agent skill for test-long-runner - invoke with $agent-test-long-runner

37

0.98x
Quality: 3%
Does it follow best practices?

Impact: 96% (0.98x)
Average score across 3 eval scenarios

Security by Snyk: Passed
No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.agents/skills/agent-test-long-runner/SKILL.md

Quality

Discovery

0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is essentially a placeholder that provides no useful information about the skill's purpose, capabilities, or appropriate usage context. It only contains an invocation command and a vague label, making it impossible for Claude to determine when to select this skill from a list of available options.

Suggestions

Add a clear statement of what the skill does with concrete actions (e.g., 'Runs long-duration test suites, monitors test progress, and reports results').

Add an explicit 'Use when...' clause with natural trigger terms users would say (e.g., 'Use when the user needs to run extended test suites, long-running tests, or performance/endurance tests').

Replace the invocation instruction ('invoke with $agent-test-long-runner') with capability-focused language; invocation details belong in the skill body, not the description.
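The three suggestions above can be turned into a quick self-check before resubmitting a description. The sketch below is illustrative only: `check_description` and its heuristics (including the 12-word threshold) are hypothetical and are not how Tessl scores discovery. The revised description simply combines the example phrasings quoted in the suggestions.

```python
import re


def check_description(description: str) -> list[str]:
    """Return problems found in a skill description (hypothetical heuristics)."""
    problems = []
    # The review asks for an explicit "Use when..." trigger clause.
    if not re.search(r"\buse when\b", description, re.IGNORECASE):
        problems.append("missing 'Use when...' clause")
    # Invocation details such as "$agent-test-long-runner" belong in the skill body.
    if re.search(r"\$[\w-]+", description) or "invoke with" in description.lower():
        problems.append("contains invocation instructions")
    # Very short descriptions rarely name concrete actions (threshold is arbitrary).
    if len(description.split()) < 12:
        problems.append("too short to describe concrete actions")
    return problems


current = "Agent skill for test-long-runner - invoke with $agent-test-long-runner"
revised = (
    "Runs long-duration test suites, monitors test progress, and reports results. "
    "Use when the user needs to run extended test suites, long-running tests, "
    "or performance/endurance tests."
)

print(check_description(current))  # three problems reported
print(check_description(revised))  # []
```

A check like this only catches the mechanical issues; whether the trigger terms match what users actually say still needs human judgment.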

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description provides no concrete actions whatsoever. 'Agent skill for test-long-runner' is entirely vague and does not describe what the skill actually does. | 1 / 3 |
| Completeness | Neither 'what does this do' nor 'when should Claude use it' is answered. The description only states how to invoke it, not what it does or when it should be selected. | 1 / 3 |
| Trigger Term Quality | The only terms present are 'test-long-runner' and '$agent-test-long-runner', which are internal identifiers rather than natural keywords a user would say. No natural language trigger terms are included. | 1 / 3 |
| Distinctiveness Conflict Risk | The description is so generic that it provides no distinguishing information. 'Agent skill for test-long-runner' could be anything and offers no clear niche or distinct triggers. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Implementation

7%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides no actionable or novel information beyond what Claude already knows. It consists entirely of generic advice ('be thorough', 'take your time', 'document everything') and lists of broad capabilities that are inherent to Claude. There is nothing specific about handling long-running tasks, managing context over extended interactions, or any concrete methodology.

Suggestions

Replace generic advice with specific, actionable techniques for managing long-running tasks (e.g., how to checkpoint progress, how to structure output for tasks that span many tool calls, how to handle context window limits).

Add concrete examples with executable code or specific commands for at least one use case (e.g., a step-by-step codebase analysis workflow with specific tools and validation steps).

Define what makes this agent different from default Claude behavior: include specific strategies, tool usage patterns, or output templates that are unique to long-running task management.

Remove the Capabilities section entirely as it just restates general Claude abilities, and replace the Instructions section with a concrete workflow that includes validation checkpoints.
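As one possible reading of the "checkpoint progress" and "validation checkpoints" suggestions above, the following is a minimal sketch of a resumable step runner. Everything in it is an assumption for illustration: the function names, the `progress.json` file, and the two placeholder steps are hypothetical and do not come from the skill or from Tessl.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> dict:
    """Return saved progress, or a fresh state if no checkpoint exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"done": [], "results": {}}


def save_checkpoint(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so an interruption never leaves a partial file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    os.replace(tmp, path)


def run_steps(steps: dict, path: Path) -> dict:
    """Run each named step once, skipping steps already recorded in the checkpoint."""
    state = load_checkpoint(path)
    for name, step in steps.items():
        if name in state["done"]:
            continue  # finished in an earlier run
        state["results"][name] = step()
        state["done"].append(name)
        save_checkpoint(path, state)  # progress is durable after every step
    return state


# Hypothetical steps standing in for real long-running work.
steps = {
    "collect_tests": lambda: 3,
    "run_suite": lambda: "passed",
}
checkpoint = Path(tempfile.mkdtemp()) / "progress.json"
first = run_steps(steps, checkpoint)
second = run_steps(steps, checkpoint)  # resumes from the file; nothing re-executes
print(second["done"])
```

Saving after every step trades a little I/O for the guarantee that a crash or context reset loses at most one step of work, which is the kind of concrete, checkable guidance the review finds missing.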

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is almost entirely generic advice that Claude already knows ('Take Your Time', 'Be Thorough', 'Document Everything'). It explains nothing Claude doesn't already understand and adds no domain-specific knowledge. The capabilities list is just a restatement of general Claude abilities. | 1 / 3 |
| Actionability | There are no concrete commands, code examples, specific tools, or executable guidance. Everything is vague direction like 'Deep dive into codebases' and 'Comprehensive research across multiple sources' without any specifics on how to actually do these things. | 1 / 3 |
| Workflow Clarity | The numbered 'Instructions' are generic platitudes (Take Your Time, Be Thorough, Iterate) rather than a meaningful workflow. There are no concrete steps, no validation checkpoints, and no sequenced process for any task. | 1 / 3 |
| Progressive Disclosure | The content has some structural organization with clear section headers (Capabilities, Instructions, Output Format, Example Use Cases), but there are no references to external files and the content itself is too thin to warrant splitting. The organization exists but serves little purpose given the lack of substantive content. | 2 / 3 |
| Total | | 5 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: ruvnet/ruflo (Reviewed)
