Analyze ARIS usage logs and propose optimizations to SKILL.md files, reviewer prompts, and workflow defaults. Outer-loop harness optimization inspired by Meta-Harness (Lee et al., 2026). Use when user says "优化技能", "meta optimize", "improve skills", "分析使用记录", or wants to optimize ARIS's own harness components based on accumulated experience.
89
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100% Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly defines a specific niche (meta-optimization of ARIS harness components), provides concrete actions, and includes explicit trigger terms in both English and Chinese. The 'Use when' clause is well-constructed with multiple natural trigger phrases. The only minor concern is the academic reference ('Lee et al., 2026') which adds context but doesn't directly help with skill selection.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'Analyze ARIS usage logs', 'propose optimizations to SKILL.md files, reviewer prompts, and workflow defaults'. Also references a specific methodology ('outer-loop harness optimization inspired by Meta-Harness'). | 3 / 3 |
| Completeness | Clearly answers both 'what' (analyze ARIS usage logs and propose optimizations to SKILL.md files, reviewer prompts, and workflow defaults) and 'when' (explicit 'Use when' clause with specific trigger phrases and a conceptual trigger condition). | 3 / 3 |
| Trigger Term Quality | Includes natural trigger terms in both English and Chinese: '优化技能', 'meta optimize', 'improve skills', '分析使用记录', plus the conceptual trigger 'optimize ARIS's own harness components based on accumulated experience'. Good coverage of how users would naturally phrase this request. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: meta-optimization of ARIS's own skill files and harness components based on usage logs. The specific domain (ARIS self-optimization) and bilingual trigger terms make it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77% Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill with a clear multi-step workflow and strong validation checkpoints (data sufficiency check, cross-model review gate, user approval requirement). Its main weakness is verbosity — the contextual framing, Meta-Harness inspiration paragraphs, acknowledgements, and some explanatory text could be trimmed significantly without losing actionable content. The progressive disclosure is reasonable but the skill itself is long enough that some sections (event schema, triggering mechanisms) could be externalized.
Suggestions
Remove or drastically shorten the Context section, Acknowledgements, and 'What This Skill Optimizes' explanatory text — Claude doesn't need the Meta-Harness motivation or the distinction between harness and research artifacts explained at length.
Move the Event Schema Reference section to a separate file (e.g., event-schema.md) and reference it from the main skill to reduce the SKILL.md length.
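
The second suggestion can be sketched roughly as follows. This is a minimal illustration, not part of the skill or the review tooling: `split_section` is a hypothetical helper, and it assumes the section is introduced by an ATX heading (`#`, `##`, ...) whose text is exactly "Event Schema Reference". Lines inside fenced code blocks are skipped so that shell comments are not mistaken for headings.

```python
import re


def split_section(markdown: str, heading: str, target: str) -> tuple[str, str]:
    """Move the section titled `heading` out of `markdown`.

    Returns (trimmed_markdown, extracted_section). The section runs from its
    heading to the next heading of the same or higher level, or to end of file.
    """
    lines = markdown.splitlines()
    start = level = None
    end = len(lines)
    in_fence = False
    for i, line in enumerate(lines):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # entering or leaving a fenced code block
            continue
        if in_fence:
            continue  # '# comment' lines inside code are not headings
        m = re.match(r"^(#+)\s+(.*?)\s*$", line)
        if not m:
            continue
        if start is None:
            if m.group(2) == heading:
                start, level = i, len(m.group(1))
        elif len(m.group(1)) <= level:
            end = i  # next heading at the same or a higher level closes the section
            break
    if start is None:
        return markdown, ""  # heading not found; leave the file unchanged
    extracted = "\n".join(lines[start:end]).rstrip() + "\n"
    # Keep the heading in place and point readers at the externalized file.
    pointer = [lines[start], "", f"See [{target}]({target}).", ""]
    trimmed = "\n".join(lines[:start] + pointer + lines[end:]) + "\n"
    return trimmed, extracted
```

A caller would write `extracted` to `event-schema.md` next to SKILL.md and overwrite SKILL.md with `trimmed`; keeping the heading plus a one-line pointer preserves the progressive-disclosure structure the review praises.
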
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long (~200+ lines) and includes some unnecessary context (e.g., explaining what ARIS is, the Meta-Harness inspiration paragraph, the 'Not optimized' clarification). The component table and event schema are useful but the framing paragraphs could be tighter. The acknowledgements section with a citation adds no actionable value. | 2 / 3 |
| Actionability | The skill provides concrete, executable bash scripts for data checking, specific diff format for patches, exact MCP tool invocation syntax for cross-model review, specific JSONL event schemas, and clear examples of optimization opportunity tables. The guidance is copy-paste ready throughout. | 3 / 3 |
| Workflow Clarity | The 7-step workflow (Steps 0-6) is clearly sequenced with explicit validation checkpoints: Step 0 checks data availability before proceeding, Step 4 provides cross-model adversarial review as a gate, Step 6 requires user approval before applying changes and includes backup/rollback. The feedback loop (reviewer score < 7 → explain what additional evidence needed) is well-defined. | 3 / 3 |
| Progressive Disclosure | The skill references several external files (shared-references/output-versioning.md, output-manifest.md, output-language.md, review-tracing.md, tools/save_trace.sh, templates/claude-hooks/meta_logging.json), which is good progressive disclosure. However, no bundle files are provided to verify these references exist, and the main SKILL.md itself is quite long; the event schema reference and triggering sections could potentially be split into separate files to keep the main skill leaner. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
81% Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
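
Both warnings above concern the SKILL.md frontmatter. The sketch below shows one way a check like `frontmatter_unknown_keys` could work; it is an illustration under stated assumptions, not the validator's actual implementation. It assumes the frontmatter is a `---`-delimited block of top-level `key: value` lines, and `KNOWN_KEYS` is an assumed set of accepted keys. The helper name, the skill name `meta-optimize`, and the `argument-hint` key in the example are hypothetical.

```python
import re

# Assumed set of accepted top-level keys; the real validator's list may differ.
KNOWN_KEYS = {"name", "description", "license", "compatibility", "metadata", "allowed-tools"}


def unknown_frontmatter_keys(skill_md: str, known=KNOWN_KEYS) -> list[str]:
    """Return top-level frontmatter keys that are not in `known`, in file order."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    unknown = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Top-level keys start at column 0; indented lines belong to nested values.
        match = re.match(r"^([A-Za-z0-9_-]+)\s*:", line)
        if match and match.group(1) not in known:
            unknown.append(match.group(1))
    return unknown


# Hypothetical frontmatter used only to exercise the helper.
example = """---
name: meta-optimize
description: Analyze ARIS usage logs.
argument-hint: "[target]"
allowed-tools: Read, Grep
---
# Body
"""
print(unknown_frontmatter_keys(example))  # → ['argument-hint']
```

Any key reported this way could either be removed or nested under `metadata`, which is the remedy the warning text itself suggests.
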
2028ac4