arn-code-execute-task

This skill should be used when the user says "execute task N", "run task N", "implement task N", "re-run task N", "retry task N", "run single task", or wants to execute a single specific task from the task list with optional review. This is for ONE task only — for executing the full plan (all tasks), use arn-code-execute-plan instead.

61

Quality

72%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./plugins/arn-code/skills/arn-code-execute-task/SKILL.md

Quality

Content

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is highly actionable and has excellent workflow clarity with clear sequencing, validation checkpoints, and comprehensive error handling. However, it is far too verbose for a SKILL.md — at this length, significant portions (error handling branches, progress tracker update logic, detailed review cycle mechanics) should be extracted into reference files. The monolithic structure makes it difficult to navigate and wastes token budget.

Suggestions

Extract the detailed error handling section (especially the test failure branches with AskUserQuestion prompts) into a separate reference file like `references/error-handling.md`

Move the PROGRESS_TRACKER.json update logic (Step 5.4) into a reference file, keeping only a one-line summary in the main workflow

Trim explanatory phrases and reduce verbosity throughout — e.g., the review option prompts and visual testing config extraction details could be much more concise

Create a reference file for the review cycle mechanics (resume vs fresh dispatch, max retries, escalation) to keep the main workflow lean
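The extraction suggested above could be scripted roughly as follows. This is a minimal sketch, not part of the review tooling: it assumes SKILL.md uses Markdown `#` headings, and the heading text `## Error Handling` and the helper name `extract_section` are hypothetical, since the review does not show the skill's actual section names. It also does not skip `#` lines inside fenced code blocks, so it should be checked by hand on real files.

```python
import re


def extract_section(skill_text: str, heading: str, ref_path: str) -> tuple[str, str]:
    """Move one Markdown section out of a skill file (hypothetical helper).

    Returns (trimmed_skill_text, reference_text). The section runs from the
    line equal to `heading` up to the next heading of the same or higher
    level, or to the end of the file.
    """
    lines = skill_text.splitlines()
    level = len(heading) - len(heading.lstrip("#"))
    if heading not in lines:
        raise ValueError(f"heading not found: {heading!r}")
    start = lines.index(heading)

    end = len(lines)
    for i in range(start + 1, len(lines)):
        match = re.match(r"^(#+)\s", lines[i])
        if match and len(match.group(1)) <= level:
            end = i
            break

    # The reference file receives the full section text.
    reference = "\n".join(lines[start:end]).rstrip() + "\n"
    # The main file keeps the heading plus a one-line pointer.
    pointer = [heading, "", f"See `{ref_path}` for the full procedure.", ""]
    trimmed = "\n".join(lines[:start] + pointer + lines[end:]).rstrip() + "\n"
    return trimmed, reference


# Illustrative input only; the real SKILL.md content is not shown in the review.
sample = (
    "# Skill\n\n## Workflow\n\nStep 1.\n\n"
    "## Error Handling\n\nLong branch A.\nLong branch B.\n\n"
    "## Notes\n\nDone.\n"
)
trimmed, reference = extract_section(
    sample, "## Error Handling", "references/error-handling.md"
)
```

Keeping the heading and a one-line pointer in the main file preserves the workflow's shape while moving the bulk of the text out of the token budget, which is the trade-off the suggestions describe.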

| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~200+ lines with extensive detail that could be significantly compressed. It over-specifies many steps (e.g., the full directory tree, detailed PROGRESS_TRACKER.json update logic, multi-paragraph error handling branches with exact AskUserQuestion prompts) that could be summarized or moved to reference files. Much of this reads like implementation specification rather than concise guidance. | 1 / 3 |
| Actionability | The skill provides highly specific, concrete guidance at every step: exact file paths, specific tool names (TaskUpdate, AskUserQuestion, Task tool), precise parameters to pass, exact JSON field names to update, and specific user-facing messages. Every action is clearly specified with no ambiguity about what to do. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced across 6 numbered steps with explicit validation checkpoints (verify task exists, verify project directory, check dependencies, review cycles with max 2 retries then escalate). Error recovery is thoroughly covered with specific branching logic for test failures, resume failures, and review cycle limits. | 3 / 3 |
| Progressive Disclosure | The entire skill is a monolithic wall of text with no content split into reference files despite its length and complexity. The error handling section alone could be a separate reference file. The detailed PROGRESS_TRACKER.json update logic, visual testing config extraction, and test failure branching logic would all benefit from being in separate reference documents. No bundle files are provided to support this large skill. | 1 / 3 |
| Total | | 8 / 12 (Passed) |

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description excels at trigger term coverage and completeness, providing explicit trigger phrases and clearly distinguishing itself from the related full-plan execution skill. Its main weakness is the lack of specificity about what concrete actions are performed when 'executing a task' — the description tells you when to use it very well but is somewhat vague about the mechanics of what it actually does.

Suggestions

Add concrete action descriptions for what 'executing a task' involves (e.g., 'Reads task details from the task list, implements code changes, runs tests, and optionally pauses for review').

| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (executing a single task from a task list) and mentions 'optional review', but doesn't describe concrete actions beyond 'execute/run/implement a task'. It lacks specifics about what executing a task entails (e.g., running code, modifying files, etc.). | 2 / 3 |
| Completeness | Clearly answers both 'what' (execute a single specific task from the task list with optional review) and 'when' (explicit trigger phrases listed, plus distinction from the full plan execution skill). The 'Use when' equivalent is the opening clause. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger phrases: 'execute task N', 'run task N', 'implement task N', 're-run task N', 'retry task N', 'run single task'. These are realistic phrases users would actually say and cover multiple variations including retry scenarios. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive: explicitly scopes to ONE task only and directly contrasts with 'arn-code-execute-plan' for full plan execution. The specific trigger phrases ('execute task N', 'run task N') create a clear niche that is unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 (Passed) |
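A check along the lines of `frontmatter_unknown_keys` could look like the sketch below. It is illustrative only: the allowed-key set is an assumption and not the validator's actual list, the helper names are hypothetical, and the `version` key in the sample is invented because the review does not say which key triggered the warning. The parser reads only top-level keys from a leading `---` block and is not a full YAML parser.

```python
# Assumption: an illustrative allow-list, NOT the real validator's spec.
ALLOWED_KEYS = {"name", "description", "license", "metadata"}


def top_level_frontmatter_keys(skill_text: str) -> list[str]:
    """Return top-level keys from a leading `---` delimited frontmatter block."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start in column 0; indented lines, list items,
        # and comments belong to a nested value or are ignored.
        if (
            line
            and not line[0].isspace()
            and not line.startswith(("#", "-"))
            and ":" in line
        ):
            keys.append(line.split(":", 1)[0].strip())
    return []  # no closing delimiter: treat as having no frontmatter


def unknown_frontmatter_keys(skill_text: str, allowed=ALLOWED_KEYS) -> list[str]:
    """List frontmatter keys that are not in the allowed set."""
    return [k for k in top_level_frontmatter_keys(skill_text) if k not in allowed]


# Hypothetical file content; `version` stands in for the unnamed unknown key.
sample_skill = (
    "---\n"
    "name: arn-code-execute-task\n"
    "description: Execute a single task from the task list.\n"
    "version: 1.0.0\n"
    "metadata:\n"
    "  owner: example\n"
    "---\n"
    "# Body\n"
)
flagged = unknown_frontmatter_keys(sample_skill)
```

Following the warning's own advice, a flagged key would either be removed or nested under `metadata`, where it no longer counts as a top-level key.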

Repository
AppsVortex/arness
Reviewed
