skill-arc-reactor

Build new Claude skills from scratch or supercharge existing ones through rigorous evaluation and iterative improvement. Use when the user wants to create, build, improve, evaluate, audit, enhance, benchmark, test, or package a skill. Also trigger for "turn this into a skill", "make this reusable", "I keep repeating this workflow", or references to SKILL.md, skill frontmatter, description optimization, or skill packaging. Do NOT use for general coding tasks, document creation, or other non-skill workflows. Even if the user just says "skill" in the context of Claude capabilities, this is likely the right skill to load.

Quality

92%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong, well-crafted description that excels across all dimensions. It provides specific capabilities, extensive natural trigger terms, explicit 'Use when' and 'Do NOT use' clauses, and clear boundaries that distinguish it from general coding or document skills. The inclusion of quoted user phrases like 'turn this into a skill' and 'I keep repeating this workflow' is particularly effective for matching real user intent.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions: build new skills, improve existing ones through evaluation, iterative improvement, benchmarking, testing, packaging. Also specifies what NOT to use it for, adding further specificity.	3 / 3
Completeness	Clearly answers both 'what' (build new skills, improve existing ones through evaluation and iteration) and 'when' (explicit 'Use when...' clause with extensive trigger terms, plus a 'Do NOT use' clause for disambiguation). Both dimensions are thoroughly addressed.	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms: 'create', 'build', 'improve', 'evaluate', 'audit', 'enhance', 'benchmark', 'test', 'package', 'turn this into a skill', 'make this reusable', 'I keep repeating this workflow', 'SKILL.md', 'skill frontmatter', 'description optimization', 'skill packaging'. These are terms users would naturally say.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive with a clear niche around skill creation/improvement. The explicit 'Do NOT use for general coding tasks, document creation, or other non-skill workflows' clause actively reduces conflict risk with other skills.	3 / 3
	Total	12 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted meta-skill with strong actionability and excellent progressive disclosure. The two-mode structure (Create/Enhance) is clearly delineated with concrete phases, validation checkpoints, and specific tooling references. The main weakness is moderate verbosity: some principles are repeated across sections, and the Improvement Philosophy section partially duplicates guidance already given in the phases. Even so, the content density is generally high enough to justify the length.

Dimension	Reasoning	Score
Conciseness	The skill is lengthy (300+ lines), but most content is structural and actionable. Some sections could be tightened: the Improvement Philosophy section restates points already made earlier (e.g., description importance, gotchas), and some guidance, such as 'explain why over MUST', appears multiple times. However, it avoids explaining concepts Claude already knows and stays focused on novel workflow instructions.	2 / 3
Actionability	Highly actionable throughout: concrete CLI commands (python scripts/aggregate_benchmark.py, python scripts/package_skill.py), specific JSON schemas for eval files, exact file paths for references, clear phase-by-phase instructions with specific deliverables at each step. The test case design guidance includes concrete categories and a complete JSON example.	3 / 3
Workflow Clarity	Both Create and Enhance modes follow clearly numbered phases with explicit sequencing. Phase 5 (Run & Evaluate) has a particularly well-structured 5-step workflow with validation checkpoints ('Wait for user feedback before making changes'), feedback loops ('fix and re-validate'), and explicit ordering ('Launch everything at once', 'Don't wait idle'). The iterate phase includes clear stop conditions.	3 / 3
Progressive Disclosure	Excellent progressive disclosure: SKILL.md serves as the orchestration overview, with detailed content delegated to clearly-signaled reference files (references/skill-anatomy.md, references/writing-guide.md, etc.). The reference tables at the bottom provide clear navigation with 'When to read' guidance. Agent files, scripts, and hooks are each in their own well-organized tables. No deeply nested references.	3 / 3
	Total	11 / 12 Passed
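
The totals in the two rubric tables can be checked by summing the per-dimension scores. The sketch below is only an illustration: the `tally` helper and the dictionary names are hypothetical, and the scores are copied from the tables above. Note that the unweighted ratio for Implementation (11 / 12, about 91.7%) does not match the displayed 85%, so the headline percentage presumably uses a weighting or mapping that this page does not show. The code reproduces the raw tallies and does not attempt to guess that formula.

```python
# Dimension scores copied from the Discovery and Implementation tables (each out of 3).
discovery = {
    "Specificity": 3,
    "Completeness": 3,
    "Trigger Term Quality": 3,
    "Distinctiveness Conflict Risk": 3,
}
implementation = {
    "Conciseness": 2,
    "Actionability": 3,
    "Workflow Clarity": 3,
    "Progressive Disclosure": 3,
}

MAX_PER_DIMENSION = 3  # every dimension in the tables is scored "n / 3"


def tally(scores):
    """Return (points earned, points possible) for one rubric table."""
    earned = sum(scores.values())
    possible = MAX_PER_DIMENSION * len(scores)
    return earned, possible


for name, scores in (("Discovery", discovery), ("Implementation", implementation)):
    earned, possible = tally(scores)
    # The percentage printed here is the plain ratio, not the site's headline score.
    print(f"{name}: {earned} / {possible} ({earned / possible:.1%} unweighted)")
```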

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: endor-matt/Arc-Reactor-Skill-Evaluator
Commit: 95142b6

Reviewed: 28 days ago
