do-in-steps

Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification

Quality

19%

Does it follow best practices?

Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./plugins/sadd/skills/do-in-steps/SKILL.md

Quality

Content

39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill has excellent workflow clarity with well-defined phases, validation checkpoints, retry loops, and escalation paths. However, it is severely undermined by extreme verbosity — repeating critical rules 4-6 times each, including massive inline examples with full prompt reproductions, and explaining orchestration concepts at length. The lack of any progressive disclosure (no bundle files, no external references) makes this a monolithic document that would consume significant context window space.

Suggestions

Reduce content by 60%+: eliminate repeated instructions (e.g., 'meta-judge FIRST', 'reuse spec across retries' each appear 4-6 times), remove threatening language, and consolidate redundant sections like the duplicate best practices that restate Phase 2-3 content.

Extract the three full examples (with their complete judge prompt reproductions) into a separate EXAMPLES.md file, keeping only a brief example summary in the main SKILL.md.

Move the Context Format Reference, Error Handling templates, and Best Practices tables into a separate REFERENCE.md, linking from the main skill with clear one-level-deep references.

Replace pseudocode Task tool dispatch notation with actual tool call syntax or a minimal concrete example, and remove the redundant 'Dispatch Example' that duplicates the dispatch instructions already given in section 3.5.

Dimension	Reasoning	Score
Conciseness	This skill is extremely verbose at ~800+ lines. It extensively explains orchestration concepts Claude already understands, repeats the same instructions multiple times (e.g., 'meta-judge FIRST in dispatch order' appears 4+ times, 'reuse same meta-judge specification across retries' repeated 6+ times), includes massive example judge prompts that are largely redundant, and contains threatening language ('you will be killed immediately') that wastes tokens. The content could be reduced by 60-70% without losing actionable information.	1 / 3
Actionability	The skill provides structured templates and prompt formats that are somewhat actionable, but it's primarily orchestration instructions rather than executable code. The bash command is just `mkdir -p .specs/reports`. Most 'code blocks' are prompt templates or markdown formatting examples rather than executable code. The Task tool dispatch examples use pseudocode notation rather than actual tool call syntax. However, the prompt templates and decision matrices do provide concrete, copy-paste-ready structures.	2 / 3
Workflow Clarity	The multi-step workflow is clearly sequenced across 4 phases with explicit validation checkpoints (judge verification after each step), feedback loops (retry with judge feedback, max 3 retries), error escalation paths, and clear decision trees for pass/fail/retry. The execution flow diagram and step-by-step dispatch protocol are well-defined with explicit verification gates before proceeding.	3 / 3
Progressive Disclosure	This is a monolithic wall of text with no references to external files despite being 800+ lines. The three full examples (Examples 1, 2, 3) with their complete judge prompt reproductions could easily be in a separate EXAMPLES.md. The context format references, error handling templates, and best practices could be split into supporting files. Everything is inlined in a single massive document with no bundle files to support it.	1 / 3
	Total	7 / 12 Passed
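
The loop that the Workflow Clarity row describes (judge verification after each step, retry with judge feedback, at most 3 retries, then escalation) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual dispatch code: `run_step`, `run_steps`, `Verdict`, and the three callables (`make_spec`, `implement`, `judge`) are hypothetical names, and reading "max 3 retries" as one initial attempt plus three retries is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

MAX_RETRIES = 3  # the review mentions "max 3 retries"; 1 attempt + 3 retries is assumed


@dataclass
class Verdict:
    """Hypothetical judge result: pass/fail plus feedback for the next attempt."""
    passed: bool
    feedback: str = ""


def run_step(
    step: str,
    make_spec: Callable[[str], str],
    implement: Callable[[str, str], str],
    judge: Callable[[str, str], Verdict],
) -> str:
    """Run one step: meta-judge first, then implement and judge, retrying on failure."""
    spec = make_spec(step)  # meta-judge runs once; the same spec is reused on every retry
    feedback = ""
    for _ in range(1 + MAX_RETRIES):
        result = implement(step, feedback)
        verdict = judge(spec, result)
        if verdict.passed:
            return result
        feedback = verdict.feedback  # judge feedback drives the next attempt
    # Retries exhausted: hand the failure to the caller (the escalation path).
    raise RuntimeError(f"step {step!r} failed after {MAX_RETRIES} retries; escalate")


def run_steps(steps, make_spec, implement, judge):
    """Run steps sequentially; each must pass its judge before the next starts."""
    return [run_step(s, make_spec, implement, judge) for s in steps]
```

Passing the sub-agents in as plain callables keeps the control flow visible in about thirty lines, which is the kind of minimal concrete example the suggestions ask for in place of repeated prose rules.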

Description

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is heavily laden with technical jargon and buzzwords while failing to communicate any concrete capabilities or use cases. It does not describe what specific tasks the skill handles, provides no natural trigger terms a user would use, and lacks any 'Use when...' guidance. It would be nearly impossible for Claude to correctly select this skill from a pool of available skills.

Suggestions

Replace 'execute complex tasks' with specific, concrete actions the skill performs (e.g., 'Breaks down multi-step problems into subtasks, delegates to specialized models, and verifies results').

Add a 'Use when...' clause with natural trigger terms users would actually say, such as 'Use when the user asks for multi-step analysis, complex workflows, or tasks requiring verification of intermediate results'.

Remove or explain jargon like 'meta-judge → LLM-as-a-judge verification' in plain language so the description is understandable and distinguishable from other skills.

Dimension	Reasoning	Score
Specificity	The description uses abstract, buzzword-heavy language like 'sequential sub-agent orchestration', 'intelligent model selection', and 'meta-judge → LLM-as-a-judge verification' without describing any concrete actions a user would recognize. 'Execute complex tasks' is extremely vague.	1 / 3
Completeness	The 'what' is vague ('execute complex tasks') and there is no 'when' clause at all. There is no explicit guidance on when Claude should select this skill, and the capabilities described are too abstract to infer usage scenarios.	1 / 3
Trigger Term Quality	The terms used are highly technical jargon ('sub-agent orchestration', 'meta-judge', 'LLM-as-a-judge') that no user would naturally say when requesting help. There are no natural keywords a user would use to trigger this skill.	1 / 3
Distinctiveness / Conflict Risk	'Execute complex tasks' is extremely generic and could conflict with virtually any skill. The technical jargon about orchestration and verification doesn't help distinguish what domain or use case this skill serves.	1 / 3
	Total	4 / 12 Passed
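
Two of the description problems raised above (no 'Use when...' clause, unexplained jargon) are mechanical enough to check with a few lines of code. The sketch below is a hypothetical heuristic, not the reviewer's scoring method: `lint_description` and the `JARGON` tuple are illustrative names, and the jargon list only contains the terms the review itself flags. The `suggested` string simply joins the two example phrasings from the suggestions.

```python
import re

# Hypothetical jargon list, limited to terms the review flags.
JARGON = ("sub-agent orchestration", "meta-judge", "llm-as-a-judge")


def lint_description(description: str) -> list[str]:
    """Return a list of problems; an empty list means these heuristics found none."""
    problems = []
    text = description.lower()
    if not re.search(r"\buse when\b", text):
        problems.append("missing 'Use when...' clause")
    for term in JARGON:
        if term in text:
            problems.append(f"unexplained jargon: {term}")
    return problems


current = (
    "Execute complex tasks through sequential sub-agent orchestration with "
    "intelligent model selection, meta-judge → LLM-as-a-judge verification"
)
suggested = (
    "Breaks down multi-step problems into subtasks, delegates to specialized "
    "models, and verifies results. Use when the user asks for multi-step "
    "analysis, complex workflows, or tasks requiring verification of "
    "intermediate results."
)

print(lint_description(current))    # four problems reported
print(lint_description(suggested))  # no problems reported
```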

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (1417 lines); consider splitting into references/ and linking	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed
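
The two warnings in the table can be reproduced locally with a small check like the one below. It is a sketch under stated assumptions: the review gives neither the line-count threshold nor the list of recognized frontmatter keys, so `MAX_LINES` and `KNOWN_KEYS` are hypothetical placeholders, and `validate_skill` is an illustrative name rather than part of any Tessl tool.

```python
import re

# Assumptions, not taken from the review: both values are hypothetical placeholders.
MAX_LINES = 500
KNOWN_KEYS = {"name", "description", "license", "metadata"}


def validate_skill(text: str) -> list[str]:
    """Return warning identifiers for the contents of a SKILL.md file."""
    warnings = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        warnings.append("skill_md_line_count")
    # Frontmatter is the block between a leading '---' line and the next '---'.
    if lines and lines[0].strip() == "---":
        try:
            end = lines.index("---", 1)
        except ValueError:
            end = 0  # unterminated frontmatter: skip the key check
        # Top-level keys only: an unindented 'key:' at the start of a line.
        keys = {
            m.group(1)
            for line in lines[1:end]
            if (m := re.match(r"([A-Za-z0-9_-]+):", line))
        }
        if keys - KNOWN_KEYS:
            warnings.append("frontmatter_unknown_keys")
    return warnings
```

Running a check like this before publishing would surface both warnings early. Moving the examples and reference tables into separate files, as suggested above, is what would bring the line count down.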

Repository: NeoLabHQ/context-engineering-kit
Path: plugins/sadd/skills/do-in-steps/SKILL.md
Commit: 3711edf

Reviewed: 1 day ago
