Use when main results pass result-to-claim (`claim_supported = yes` or `partial`) and ablation studies are needed for paper submission. A secondary Codex agent designs ablations from a reviewer's perspective; the local executor reviews feasibility and implements.
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./skills/skills-codex/ablation-planner/SKILL.md`

Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has strong completeness with an explicit 'Use when' clause and clear triggering conditions, and occupies a very distinct niche. However, the specific actions could be more concrete (what exactly does 'designs ablations' and 'implements' entail?), and the trigger terms lean heavily on technical jargon that may not match how users naturally phrase requests.
Suggestions

- Add more concrete action verbs describing what the skill does, e.g., 'identifies key components to ablate, designs controlled experiments, generates comparison tables, and produces ablation result summaries'.
- Include more natural trigger terms users might say, such as 'ablation experiments', 'component contribution analysis', 'justify experimental results', or 'reviewer requested ablations'. A sketch combining both suggestions follows this list.
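As a concrete illustration, the two suggestions combined might yield frontmatter along these lines (a hypothetical sketch following the usual SKILL.md convention; the wording is not the skill's actual metadata):

```yaml
# Hypothetical SKILL.md frontmatter sketch, not the skill's real metadata.
name: ablation-planner
description: >
  Use when main results pass result-to-claim (claim_supported = yes or
  partial) and ablation experiments are needed for paper submission or a
  reviewer response. Identifies key components to ablate, designs
  controlled experiments from a reviewer's perspective, checks local
  feasibility, implements and runs the ablations, and generates comparison
  tables and ablation result summaries.
```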
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names a specific domain (ablation studies for paper submission) and mentions some actions (designs ablations, reviews feasibility, implements), but the actions are somewhat vague and don't list concrete specific operations like 'removes individual components', 'reruns experiments', 'generates comparison tables'. | 2 / 3 |
| Completeness | The description explicitly answers both 'what' (designs ablations from a reviewer's perspective, reviews feasibility, implements) and 'when' (when main results pass result-to-claim with claim_supported = yes or partial and ablation studies are needed for paper submission). The 'Use when' clause is present and specific. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'ablation studies', 'paper submission', 'reviewer's perspective', and 'result-to-claim', but these are fairly technical/jargon-heavy. Missing more natural user phrases like 'ablation experiments', 'component analysis', 'justify results', or 'reviewer response'. | 2 / 3 |
| Distinctiveness / Conflict Risk | This is a very specific niche: ablation study design triggered by a specific upstream condition (result-to-claim passing). The combination of ablation studies, reviewer perspective, and the prerequisite condition makes it highly unlikely to conflict with other skills. | 3 / 3 |
| Total | | 10 / 12 (Passed) |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured workflow skill with clear sequencing and good validation checkpoints, particularly the feasibility review step before execution. Its main weaknesses are the lack of truly executable/concrete implementation examples (configs, scripts, commands) and moderate verbosity in the table templates and agent prompt that could be tightened or split into referenced files. The rules section is strong and provides clear guardrails.
Suggestions

- Add a concrete, minimal example of an ablation config file or script modification to make Step 5 more actionable (e.g., a sample YAML config diff for a config-only ablation); a sketch follows this list.
- Consider moving the detailed markdown table template and the full Codex agent prompt into a referenced file (e.g., ABLATION_TEMPLATES.md) to reduce the main skill's token footprint while preserving the overview structure.
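To make the first suggestion concrete, a config-only ablation could be expressed as a pair of YAML files that differ in a single key (all file names, keys, and values below are hypothetical):

```yaml
# Hypothetical baseline config: configs/baseline.yaml
model:
  encoder: transformer
  use_auxiliary_loss: true    # component under study
training:
  seed: 42                    # held fixed across runs
---
# Hypothetical ablation config: configs/ablate_aux_loss.yaml
# Identical to the baseline except for the single toggled component.
model:
  encoder: transformer
  use_auxiliary_loss: false   # ablated: auxiliary loss disabled
training:
  seed: 42
```

Keeping every key identical except the one being toggled is what makes any difference in results attributable to the ablated component.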
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary verbosity, such as the detailed markdown table templates and the lengthy Codex prompt that could be more compact. Some explanatory text (e.g., 'The reviewer thinks like a reviewer; the local executor thinks like an engineer') is mildly redundant but not egregious. | 2 / 3 |
| Actionability | The skill provides structured guidance with table formats and a clear agent prompt, but lacks truly executable code: there are no actual scripts, commands, or config examples to copy-paste. The 'Implement and Run' step is high-level ('create configs or scripts') without concrete examples of what those configs look like. | 2 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with logical progression from context gathering through design, parsing, feasibility review, and implementation. Step 4 serves as an explicit validation/feasibility checkpoint before execution, and Step 5 includes smoke testing before full runs. The feedback loop for budget constraints (propose cuts, ask for re-prioritization) is well-defined. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and headers, but it's somewhat monolithic; the detailed table templates and the full agent prompt could be split into referenced files. There are no external file references for advanced usage or examples, though the skill does reference project files like EXPERIMENT_LOG.md and findings.md appropriately. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 10 / 11 passed.
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | | 10 / 11 (Passed) |
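The single warning points at the frontmatter's allowed-tools list. Since the report doesn't show the offending entry, the sketch below is only a guess at the shape of the fix: replace any unrecognized name with tools the validator knows about.

```yaml
# Hypothetical frontmatter fix; the original unrecognized entry is not
# shown in the report, and these replacement tool names are assumptions.
allowed-tools:
  - Bash
  - Read
  - Write
```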