
do-in-steps

Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification

27

Quality

19%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./plugins/sadd/skills/do-in-steps/SKILL.md

Quality

Content

39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill has excellent workflow clarity with well-defined phases, validation checkpoints, retry loops, and escalation paths. However, it is severely undermined by extreme verbosity — repeating critical rules 4-6 times each, including massive inline examples with full prompt reproductions, and explaining orchestration concepts at length. The lack of any progressive disclosure (no bundle files, no external references) makes this a monolithic document that would consume significant context window space.

Suggestions

Reduce content by 60%+: eliminate repeated instructions (e.g., 'meta-judge FIRST', 'reuse spec across retries' each appear 4-6 times), remove threatening language, and consolidate redundant sections like the duplicate best practices that restate Phase 2-3 content.

Extract the three full examples (with their complete judge prompt reproductions) into a separate EXAMPLES.md file, keeping only a brief example summary in the main SKILL.md.

Move the Context Format Reference, Error Handling templates, and Best Practices tables into a separate REFERENCE.md, linking from the main skill with clear one-level-deep references.

Replace pseudocode Task tool dispatch notation with actual tool call syntax or a minimal concrete example, and remove the redundant 'Dispatch Example' that duplicates the dispatch instructions already given in section 3.5.
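One rough way to find the repeated instructions these suggestions refer to is to count normalized sentences in the skill text. The sketch below is only an illustration: the function name `repeated_instructions`, the sentence-splitting heuristic, and the sample text are all assumptions, not part of Tessl or the reviewed skill.

```python
import re
from collections import Counter


def repeated_instructions(text: str, min_count: int = 3) -> dict[str, int]:
    """Return normalized sentences that occur at least `min_count` times.

    Hypothetical helper: splitting on punctuation and newlines is a crude
    heuristic, good enough to surface verbatim repeats.
    """
    parts = re.split(r"[.!?\n]+", text)
    # Collapse whitespace and lowercase so trivial variants compare equal.
    normalized = [re.sub(r"\s+", " ", p).strip().lower() for p in parts]
    # Ignore fragments shorter than three words (headings, stray tokens).
    counts = Counter(s for s in normalized if len(s.split()) >= 3)
    return {s: n for s, n in counts.items() if n >= min_count}


# Made-up sample text, not taken from the actual SKILL.md.
sample = (
    "Dispatch the meta-judge FIRST. Then run the step.\n"
    "Dispatch the meta-judge first. Reuse the spec across retries.\n"
    "Dispatch the meta-judge first. Reuse the spec across retries.\n"
)
print(repeated_instructions(sample))
```

Sentences flagged this way are candidates for consolidation into a single statement; paraphrased repeats would need a fuzzier comparison.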

Dimension | Reasoning | Score

Conciseness

This skill is extremely verbose at 800+ lines. It extensively explains orchestration concepts Claude already understands, repeats the same instructions multiple times (e.g., 'meta-judge FIRST in dispatch order' appears 4+ times, 'reuse same meta-judge specification across retries' is repeated 6+ times), includes massive example judge prompts that are largely redundant, and contains threatening language ('you will be killed immediately') that wastes tokens. The content could be reduced by 60-70% without losing actionable information.

1 / 3

Actionability

The skill provides structured templates and prompt formats that are somewhat actionable, but it consists primarily of orchestration instructions rather than executable code. The only bash command is `mkdir -p .specs/reports`. Most 'code blocks' are prompt templates or markdown formatting examples rather than executable code. The Task tool dispatch examples use pseudocode notation rather than actual tool call syntax. However, the prompt templates and decision matrices do provide concrete, copy-paste-ready structures.

2 / 3

Workflow Clarity

The multi-step workflow is clearly sequenced across 4 phases with explicit validation checkpoints (judge verification after each step), feedback loops (retry with judge feedback, max 3 retries), error escalation paths, and clear decision trees for pass/fail/retry. The execution flow diagram and step-by-step dispatch protocol are well-defined with explicit verification gates before proceeding.

3 / 3

Progressive Disclosure

This is a monolithic wall of text with no references to external files despite being 800+ lines. The three full examples (Examples 1, 2, 3) with their complete judge prompt reproductions could easily be in a separate EXAMPLES.md. The context format references, error handling templates, and best practices could be split into supporting files. Everything is inlined in a single massive document with no bundle files to support it.

1 / 3

Total

7 / 12

Passed
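The Workflow Clarity reasoning above describes a per-step loop: a meta-judge specification, judge verification after each step, retries driven by judge feedback (max 3), and escalation on failure. A minimal sketch of that control flow, assuming the specification is generated once per step and that "max 3 retries" means one initial attempt plus three retries, might look like the following. All names (`run_step`, `Verdict`, `StepFailed`, the stub callables) are hypothetical and do not come from the skill itself.

```python
from dataclasses import dataclass
from typing import Callable

MAX_RETRIES = 3  # "max 3 retries" per the review; the interpretation is an assumption


@dataclass
class Verdict:
    passed: bool
    feedback: str = ""


class StepFailed(Exception):
    """Raised to escalate when a step still fails after all retries."""


def run_step(
    step: str,
    meta_judge: Callable[[str], str],
    implement: Callable[[str, str], str],
    judge: Callable[[str, str], Verdict],
    max_retries: int = MAX_RETRIES,
) -> str:
    # The meta-judge runs first and only once; its spec is reused on retries.
    spec = meta_judge(step)
    feedback = ""
    for _attempt in range(1 + max_retries):
        result = implement(step, feedback)
        verdict = judge(spec, result)
        if verdict.passed:
            return result  # verification gate passed; caller moves on
        feedback = verdict.feedback  # feed the judge's notes into the retry
    raise StepFailed(f"{step!r} failed after {1 + max_retries} attempts")


# Stub sub-agents for illustration only.
meta_calls: list[str] = []


def fake_meta_judge(step: str) -> str:
    meta_calls.append(step)
    return f"criteria for {step}"


def fake_implement(step: str, feedback: str) -> str:
    return "revised" if feedback else "draft"


def fake_judge(spec: str, result: str) -> Verdict:
    return Verdict(result == "revised", "needs revision")


print(run_step("step 1", fake_meta_judge, fake_implement, fake_judge))
```

In a sequential pipeline the caller would invoke `run_step` for each step in order and stop at the first `StepFailed`, which corresponds to the escalation path the review mentions.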

Description

0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is heavily laden with technical jargon and buzzwords while failing to communicate any concrete capabilities or use cases. It does not describe what specific tasks the skill handles, provides no natural trigger terms a user would use, and lacks any 'Use when...' guidance. It would be nearly impossible for Claude to correctly select this skill from a pool of available skills.

Suggestions

Replace 'execute complex tasks' with specific, concrete actions the skill performs (e.g., 'Breaks down multi-step problems into subtasks, delegates to specialized models, and verifies results').

Add a 'Use when...' clause with natural trigger terms users would actually say, such as 'Use when the user asks for multi-step analysis, complex workflows, or tasks requiring verification of intermediate results'.

Remove or explain jargon like 'meta-judge → LLM-as-a-judge verification' in plain language so the description is understandable and distinguishable from other skills.

Dimension | Reasoning | Score

Specificity

The description uses abstract, buzzword-heavy language like 'sequential sub-agent orchestration', 'intelligent model selection', and 'meta-judge → LLM-as-a-judge verification' without describing any concrete actions a user would recognize. 'Execute complex tasks' is extremely vague.

1 / 3

Completeness

The 'what' is vague ('execute complex tasks') and there is no 'when' clause at all. There is no explicit guidance on when Claude should select this skill, and the capabilities described are too abstract to infer usage scenarios.

1 / 3

Trigger Term Quality

The terms used are highly technical jargon ('sub-agent orchestration', 'meta-judge', 'LLM-as-a-judge') that no user would naturally say when requesting help. There are no natural keywords a user would use to trigger this skill.

1 / 3

Distinctiveness / Conflict Risk

'Execute complex tasks' is extremely generic and could conflict with virtually any skill. The technical jargon about orchestration and verification doesn't help distinguish what domain or use case this skill serves.

1 / 3

Total

4 / 12

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria | Description | Result

skill_md_line_count

SKILL.md is long (1417 lines); consider splitting into references/ and linking

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total

9 / 11

Passed
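The two warnings in this table can be approximated locally before submitting a skill for review. The sketch below is not Tessl's validator: the 500-line threshold and the set of allowed frontmatter keys are placeholder assumptions, and the frontmatter is scanned with a simple regular expression rather than a YAML parser.

```python
import re

MAX_LINES = 500  # hypothetical threshold; the real limit is not stated in this report
ALLOWED_KEYS = {"name", "description", "metadata"}  # assumed set of known keys


def check_skill(text: str) -> list[str]:
    """Return warnings loosely mirroring the two checks shown above."""
    warnings = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        warnings.append(f"skill_md_line_count: {len(lines)} lines")
    # Frontmatter is assumed to sit between an opening and a closing '---' line.
    if lines and lines[0].strip() == "---":
        try:
            end = lines.index("---", 1)
        except ValueError:
            end = len(lines)
        # Only top-level keys match: indented (nested) keys start with a space.
        keys = {
            m.group(1)
            for line in lines[1:end]
            if (m := re.match(r"([A-Za-z_][\w-]*):", line))
        }
        unknown = sorted(keys - ALLOWED_KEYS)
        if unknown:
            warnings.append(f"frontmatter_unknown_keys: {', '.join(unknown)}")
    return warnings


# Made-up file content for illustration.
sample = (
    "---\nname: do-in-steps\ndescription: demo\nargument-hint: task\n---\n"
    + "body\n" * 600
)
print(check_skill(sample))
```

Running a check like this in CI would surface both warnings early, though the authoritative result is still the one reported by the review itself.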

Repository
NeoLabHQ/context-engineering-kit
Reviewed
