This skill should be used when the user says "arness code assess", "arn-code-assess", "assess codebase", "technical review", "codebase assessment", "find improvements", "what should I improve", "tech debt review", "tech debt audit", "pattern compliance check", "codebase health check", "assess the project", "improvement plan", "review my codebase", "what needs fixing", "code quality check", "audit my code", "run an assessment", or wants a comprehensive technical assessment of the codebase against stored patterns followed by prioritized improvement execution through the full Arness pipeline.
Overall score: 77

Quality: 72% (Does it follow best practices?)
Impact: no score; no eval scenarios have been run
Validation: Passed; no known issues
Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./plugins/arn-code/skills/arn-code-assess/SKILL.md
```

Quality
Discovery
82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description excels at providing explicit trigger terms and clearly stating when to use the skill, but it is weak on specificity about what the skill actually does beyond a high-level summary. The long list of trigger phrases carries the 'when' side well, but the actual capabilities ('assessment against stored patterns', 'prioritized improvement execution') remain vague, and several generic trigger terms risk conflicting with other code quality or review skills.
Suggestions
- Add specific concrete actions the skill performs, e.g., 'Scans the codebase for pattern compliance violations, generates a prioritized list of improvements, and executes fixes in order of impact' instead of the vague 'comprehensive technical assessment'.
- Reduce conflict risk by clarifying what distinguishes this from generic code review skills, e.g., mention the specific stored patterns, the Arness pipeline stages, or unique outputs like assessment reports.
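Both suggestions could be addressed in the frontmatter description itself. A hypothetical sketch follows; the wording and the pipeline stage names are illustrative and are not taken from the actual SKILL.md:

```markdown
---
name: arn-code-assess
description: >
  Scans the codebase for violations of patterns stored in the Arness pattern
  library, generates a prioritized assessment report, and executes fixes
  through the Arness pipeline (assess, plan, implement, verify). Use when the
  user says "arness code assess", "assess codebase", "tech debt review",
  "codebase health check", or asks for a pattern-compliance assessment with
  prioritized improvement execution.
---
```

A description in this shape names concrete outputs (the assessment report) and the distinguishing mechanism (stored patterns, the Arness pipeline), which addresses both the specificity and the conflict-risk suggestions at once.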
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description mentions 'comprehensive technical assessment of the codebase against stored patterns' and 'prioritized improvement execution through the full Arness pipeline', which names the domain and some actions but doesn't list multiple specific concrete actions like what the assessment checks, what outputs are produced, or what the pipeline steps are. | 2 / 3 |
| Completeness | The description explicitly answers both 'what' (comprehensive technical assessment against stored patterns with prioritized improvement execution) and 'when' (with a detailed list of trigger phrases prefaced by 'This skill should be used when'). The 'when' is very explicit. | 3 / 3 |
| Trigger Term Quality | The description includes an extensive list of natural trigger phrases users would actually say, such as 'assess codebase', 'tech debt review', 'code quality check', 'what needs fixing', 'audit my code', 'find improvements', covering many common variations and natural language patterns. | 3 / 3 |
| Distinctiveness / Conflict Risk | While the Arness-specific trigger terms ('arness code assess', 'arn-code-assess', 'Arness pipeline') are distinctive, many of the generic triggers like 'code quality check', 'what needs fixing', 'find improvements', and 'tech debt review' could easily overlap with other code review or linting skills. | 2 / 3 |
| Total | | 10 / 12 Passed |
Implementation
62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured orchestration skill with excellent workflow clarity — the 7 decision gates, resumability detection, and sequential execution with conflict checking are thoroughly defined. Its main weaknesses are moderate verbosity (repeated progress bars, some sections that could be more concise) and incomplete actionability since it references external protocol files that aren't bundled. The skill correctly positions itself as a sequencer that delegates to sub-skills, but the sheer length could be reduced.
Suggestions
- Remove or consolidate the repeated ASCII progress bar displays: define the format once and just reference the current stage name at each step, saving ~50 tokens per repetition.
- Provide the referenced bundle files (assessment-protocol.md, orchestration-flow.md) or inline the critical parts needed for actionability; currently key procedures like 'merge, deduplication, ID assignment' and 'conflict detection algorithm' are undefined in the evaluated content.
- Add a concrete example of an agent invocation showing exact tool call syntax, so Claude knows the precise format rather than inferring from descriptions like 'invoke arn-code-architect with the architect assessment prompt template'.
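As a sketch of the last suggestion, an orchestration step could spell out the invocation rather than describe it. The step number, tool name, field names, and file names below are hypothetical, since the real SKILL.md and its agent interface were not part of the evaluated content:

```markdown
### Step 3: Architect assessment

Invoke the architect sub-agent with the Agent tool:

- subagent: `arn-code-architect`
- prompt: the "Architect Assessment Prompt" section of
  `assessment-protocol.md`, with `{{CODEBASE_ROOT}}` replaced by the
  repository root
- expected output: `architect-findings.md` written to the assessment
  artifacts directory
```

Even a sketch at this level of precision removes the inference Claude currently has to do from prose like 'invoke arn-code-architect with the architect assessment prompt template'.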
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~300+ lines) but most content is structural (decision gates, workflow steps, error handling). Some sections could be tightened; e.g., the progress bar ASCII art is repeated 10+ times consuming significant tokens, and some explanations are slightly verbose. However, it avoids explaining concepts Claude already knows and stays focused on orchestration logic. | 2 / 3 |
| Actionability | The skill provides clear gate definitions, specific tool invocations (Skill: arn-code:arn-code-plan, Agent tool calls), and structured decision trees. However, it lacks concrete executable examples: no actual command syntax for agent invocations, no example assessment report format, and it references external protocol files (assessment-protocol.md, orchestration-flow.md) that aren't provided. The guidance is specific but not fully copy-paste ready. | 2 / 3 |
| Workflow Clarity | Excellent workflow clarity with 13 clearly numbered steps, 7 explicit decision gates in a summary table, a pipeline overview diagram, resumability detection with artifact-based state table, conflict detection between specs, and explicit error recovery paths (retry/skip/abort). Validation checkpoints are present at testing (G6) and conflict detection (G5) stages. | 3 / 3 |
| Progressive Disclosure | The skill references two external files (assessment-protocol.md and orchestration-flow.md), which is good progressive disclosure design, but these bundle files are not provided for evaluation. The SKILL.md itself is quite long and monolithic; the decision gate table, error handling section, and constraints could potentially be split out. The references are clearly signaled with Read commands, but the main file carries a lot of inline detail. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
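The frontmatter warning can usually be cleared by nesting non-standard keys under `metadata`. A minimal sketch follows; the report does not name the offending keys, so `version` and `maintainer` here are purely illustrative:

```markdown
---
name: arn-code-assess
description: Comprehensive technical assessment against stored patterns.
metadata:
  version: 1.2.0        # illustrative: previously a top-level unknown key
  maintainer: arness    # illustrative: previously a top-level unknown key
---
```

Keys the spec does not recognize at the top level are generally tolerated under `metadata`, which is what the validator's suggestion ('removing or moving to metadata') points at.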