Scan Codex session history for skill failures, usage patterns, and coverage gaps. Use when the user wants daily skill-health monitoring or evidence-backed recommendations about installing, improving, merging, or pruning skills.
65
78%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./Infrastructure/references/deferred-skill-context/skill-factory-skill-refactor/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly defines a specific niche (Codex skill health monitoring), lists concrete actions, includes natural trigger terms, and explicitly states both what the skill does and when to use it. The description is concise yet comprehensive, and the domain is distinctive enough to avoid conflicts with other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'scan session history for skill failures, usage patterns, and coverage gaps' plus 'installing, improving, merging, or pruning skills'. These are concrete, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what ('Scan Codex session history for skill failures, usage patterns, and coverage gaps') and when ('Use when the user wants daily skill-health monitoring or evidence-backed recommendations about installing, improving, merging, or pruning skills'). | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'skill failures', 'usage patterns', 'coverage gaps', 'skill-health monitoring', 'installing', 'improving', 'merging', 'pruning skills', 'session history'. These are terms a user managing a Codex skill library would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche focused specifically on Codex session history analysis and skill lifecycle management. Unlikely to conflict with other skills due to the very specific domain of skill-health monitoring and meta-level skill management. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured analytical/process skill that clearly defines scope, constraints, and deliverables with good progressive disclosure to supporting references. Its main weaknesses are the lack of concrete executable examples (no sample command invocations, no example output schema inline) and the separation of validation from the main workflow steps. Some content is mildly redundant across sections (failure handling appears in Constraints, Validation, and Failure mode).
Suggestions
Add an inline example of the keep/improve/merge/retire action table or structured output schema so the agent knows exactly what format to produce.
Include at least one concrete command invocation for the referenced scripts (e.g., `python scripts/scan_codex_sessions.py --scope=last-week --output=json`) to make the procedure more actionable.
Integrate validation checkpoints directly into the procedure steps (e.g., after step 2: 'If any evidence source is missing or unreadable, stop and report the gap') rather than listing them in a separate section.
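As an illustration of the first suggestion, the sketch below shows one possible shape for a keep/improve/merge/retire action table. It is not the skill's actual schema: the `Recommendation` fields, the column names, and the sample skill and session identifiers are all hypothetical, chosen only to show what an inline, evidence-backed output format could look like.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four actions named in the review (keep/improve/merge/retire)."""
    KEEP = "keep"
    IMPROVE = "improve"
    MERGE = "merge"
    RETIRE = "retire"


@dataclass
class Recommendation:
    # All field names are assumptions for illustration, not the skill's schema.
    skill: str            # name of the skill being assessed
    action: Action        # recommended action
    evidence: list[str]   # identifiers of the session records backing the action
    rationale: str        # one-line justification


def render_action_table(rows: list[Recommendation]) -> str:
    """Render recommendations as a pipe-delimited table.

    Refuses to render a row without evidence, so every recommendation
    stays evidence-backed.
    """
    lines = ["| Skill | Action | Evidence | Rationale |", "|---|---|---|---|"]
    for r in rows:
        if not r.evidence:
            raise ValueError(f"{r.skill}: recommendation has no evidence")
        lines.append(
            f"| {r.skill} | {r.action.value} | {'; '.join(r.evidence)} | {r.rationale} |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample row; not taken from any real session history.
    sample = [
        Recommendation("example-skill", Action.IMPROVE, ["session-001"],
                       "fails on empty input"),
    ]
    print(render_action_table(sample))
```

Rejecting rows with empty evidence is a design choice in this sketch: it keeps a retire or merge recommendation from being emitted without a traceable source.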
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some sections that could be tightened: 'Philosophy' bullets are somewhat generic, 'When to use' partially restates the description, and the 'Examples' section lists natural-language prompts rather than adding technical value. Some redundancy between 'Failure mode' and 'Constraints' (e.g., missing evidence handling appears in both). | 2 / 3 |
| Actionability | The procedure provides a clear sequence of steps and references concrete scripts and paths, but there are no executable code snippets, command-line invocations, or example output schemas inline. The deliverables describe what to produce but don't show a concrete example of the action table or structured output format. | 2 / 3 |
| Workflow Clarity | The 6-step procedure is clearly sequenced and the Validation section adds checkpoints, including a 'fail fast' rule. However, there are no explicit feedback loops (e.g., validate → fix → retry) within the procedure itself, and the validation steps are separated from the workflow rather than integrated as checkpoints between steps. For a process involving potentially destructive recommendations (retire/remove skills), the lack of inline verification gates caps this at 2. | 2 / 3 |
| Progressive Disclosure | The skill effectively uses progressive disclosure: it provides a concise overview with clearly signaled one-level-deep references to contract.yaml, session-evidence-workflow.md, and specific scripts. References are well-organized with contextual 'Read when' triggers. The main body stays at summary level while pointing to detailed materials. | 3 / 3 |
| Total | | 9 / 12 Passed |
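The inline verification gate that the Workflow Clarity row finds missing could look roughly like the sketch below: a fail-fast check that runs between procedure steps and stops when an evidence source is missing or unreadable. The function names, the `EvidenceGap` exception, and the idea that evidence sources are plain files are assumptions made for illustration; the skill's real evidence layout is not described in this review.

```python
import os
from pathlib import Path


class EvidenceGap(Exception):
    """Raised when one or more evidence sources cannot be used."""


def find_evidence_gaps(paths: list[str]) -> list[str]:
    """Return a human-readable gap for each missing or unreadable source."""
    gaps = []
    for p in paths:
        path = Path(p)
        if not path.is_file():
            gaps.append(f"{p}: missing")
        elif not os.access(path, os.R_OK):
            gaps.append(f"{p}: unreadable")
    return gaps


def require_evidence(paths: list[str]) -> None:
    """Fail-fast gate: stop and report the gap before any recommendation step."""
    gaps = find_evidence_gaps(paths)
    if gaps:
        raise EvidenceGap("; ".join(gaps))
```

Calling `require_evidence` before the step that produces retire or merge recommendations would place the check inside the workflow rather than in a separate validation section.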
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 10 / 11 Passed | |
4c78f98