Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.
81
73% (Does it follow best practices?)
Impact: 92% (4.84x), average score across 3 eval scenarios
Advisory: Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/chainsaw/SKILL.md

Quality
Discovery
75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is functional and clearly scoped to a specific tool (Ark Chainsaw with mock-llm), which makes it distinctive and unlikely to conflict with other skills. It successfully covers both 'what' and 'when' dimensions. However, it could benefit from more specific action descriptions and additional natural trigger terms to improve discoverability.
Suggestions
Expand the specific actions listed, e.g., 'configure mock-llm responses, analyze test output, set up test fixtures' to improve specificity.
Add natural keyword variations users might use, such as 'end-to-end tests', 'test failures', 'chainsaw test suite', or 'mock LLM'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Ark Chainsaw tests with mock-llm) and some actions (run, write, debug, create), but doesn't elaborate on what these tests involve or what specific capabilities are provided (e.g., configuring mock responses, analyzing test output, setting up fixtures). | 2 / 3 |
| Completeness | Clearly answers both 'what' (run and write Ark Chainsaw tests with mock-llm) and 'when' ('Use for running tests, debugging failures, or creating new e2e tests'), providing explicit trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'Ark Chainsaw tests', 'mock-llm', 'e2e tests', 'debugging failures', and 'running tests'. However, it misses natural variations users might say such as 'end-to-end tests', 'test failures', 'chainsaw test suite', or 'mock LLM responses'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Very specific niche targeting 'Ark Chainsaw tests with mock-llm'; this is unlikely to conflict with other skills due to the highly specific product/tool name and testing context. | 3 / 3 |
| Total | | 10 / 12 Passed |
Implementation
72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, concise skill that excels at progressive disclosure and token efficiency. Its main weakness is that the 'Writing Tests' section lacks any inline actionable guidance, fully deferring to external files, which leaves the skill somewhat incomplete as a standalone reference for test creation. The running tests section is strong with concrete commands.
Suggestions
Add a minimal inline example or key steps for writing a new test (e.g., the essential chainsaw-test.yaml structure) so the skill is actionable without requiring external file reads.
Include a brief sequential workflow for creating a new test: create directory → write manifests → write test definition → run and validate.
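
The workflow suggested above (create directory, write manifests, write test definition, run and validate) might be sketched as the following shell script. It is only an illustration, assuming the tests use Kyverno Chainsaw's `Test` resource; the `tests/example-query` path, the file names other than `chainsaw-test.yaml`, and the placeholder ConfigMap are all hypothetical, and no mock-llm or Ark-specific resources are shown because the reviewed skill's actual patterns live in its external files.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical test location; adjust to the repository's real layout.
TEST_DIR="tests/example-query"

# 1. Create the test directory.
mkdir -p "$TEST_DIR"

# 2. Write the manifests to apply.
#    A placeholder ConfigMap stands in for the real resources under test.
cat > "$TEST_DIR/resources.yaml" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: example
data:
  greeting: hello
EOF

# 3a. Write the expected state that the test asserts on.
cat > "$TEST_DIR/assert.yaml" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: example
data:
  greeting: hello
EOF

# 3b. Write the test definition: apply the manifests, then assert the state.
cat > "$TEST_DIR/chainsaw-test.yaml" <<'EOF'
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: example-query
spec:
  steps:
    - name: apply-and-assert
      try:
        - apply:
            file: resources.yaml
        - assert:
            file: assert.yaml
EOF

# 4. Run and validate. This step needs the chainsaw CLI and a reachable
#    cluster, so the script only prints the command instead of running it.
echo "Next: chainsaw test --test-dir $TEST_DIR"
```

Printing the final command rather than executing it keeps the scaffold safe to run on a machine without a cluster; in practice the last step would be run directly once the manifests reflect the real resources.
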
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient. Every section provides only essential information without explaining concepts Claude already knows. No unnecessary preamble or padding. | 3 / 3 |
| Actionability | Running tests has concrete, copy-paste-ready commands. However, the 'Writing Tests' section delegates entirely to external files without providing any inline actionable guidance, and the test structure section is descriptive rather than instructive. | 2 / 3 |
| Workflow Clarity | The running tests section provides clear commands including debug mode, but there's no explicit workflow sequence for writing tests (e.g., steps to create a new test, validate it, run it). The writing process is entirely deferred to external references. | 2 / 3 |
| Progressive Disclosure | Content is well-structured with a concise overview and clear one-level-deep references to `tests/CLAUDE.md` and `examples.md` for detailed patterns. The skill appropriately keeps the overview brief and delegates comprehensive content. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
f4bfd2d