Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.
82
73%
Does it follow best practices?
Impact
95%
1.37x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/chainsaw/SKILL.md
Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is functional and well-structured, with a clear 'Use for...' clause that states when to apply the skill. Its main strength is distinctiveness, which comes from the project-specific terminology. However, the actions described are high-level and would benefit from concrete details about what the skill actually does (e.g., specific test patterns, configuration, assertion types).
Suggestions
Add more specific concrete actions beyond 'run and write' — e.g., 'configure mock-llm responses, set up test fixtures, analyze test output logs'
Include additional natural trigger terms users might say, such as 'integration tests', 'end-to-end', 'test suite', or 'CI failures'
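As a rough illustration of the two suggestions above, the skill's frontmatter description might be expanded along these lines. This is only a sketch: the `name` value is inferred from the `./.claude/skills/chainsaw/` path, and the added actions and trigger terms are the review's own examples, so they would need to be checked against what the skill really covers.

```yaml
---
# Hypothetical SKILL.md frontmatter; wording is illustrative, not the author's.
name: chainsaw
description: >-
  Run and write Ark Chainsaw tests with mock-llm: configure mock-llm responses,
  set up test fixtures, and analyze test output logs. Use for running the test
  suite, debugging test or CI failures, or creating new e2e (end-to-end,
  integration) tests.
---
```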
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Ark Chainsaw tests with mock-llm) and some actions (run, write, debug, create), but doesn't elaborate on what these tests involve or what specific capabilities are provided beyond high-level verbs. | 2 / 3 |
| Completeness | Clearly answers both what ('Run and write Ark Chainsaw tests with mock-llm') and when ('Use for running tests, debugging failures, or creating new e2e tests') with explicit trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'Ark Chainsaw tests', 'mock-llm', 'e2e tests', 'debugging failures', but these are fairly project-specific terms. Missing common variations a user might say like 'integration tests', 'test failures', or 'end-to-end'. | 2 / 3 |
| Distinctiveness Conflict Risk | Very specific to 'Ark Chainsaw' and 'mock-llm', which are distinct enough project-specific terms that this is unlikely to conflict with other testing or development skills. | 3 / 3 |
| Total | | 10 / 12 Passed |
Implementation
72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with excellent concrete examples, particularly in the antipatterns section which provides both bad and good YAML patterns. The main weaknesses are moderate verbosity in the antipattern explanations and a lack of an explicit step-by-step workflow for writing new tests (the skill delegates this to external files without a clear sequence). The progressive disclosure is well-handled with appropriate references.
Suggestions
Add a brief numbered workflow for writing a new test (e.g., 1. Create directory, 2. Write chainsaw-test.yaml, 3. Add manifests in order, 4. Run with --fail-fast to verify) rather than fully delegating to external files.
Tighten the antipattern explanations — the 'Bad' code comments and brief rationale are sufficient; the paragraph-length explanations before each code block could be reduced to one sentence each.
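The numbered workflow proposed in the first suggestion could be anchored by a minimal `chainsaw-test.yaml`. The sketch below uses the standard Chainsaw `Test` schema, but the test name and every file name under `manifests/` are hypothetical placeholders, not the project's real layout or naming convention.

```yaml
# chainsaw-test.yaml: hypothetical skeleton for a new e2e test.
# All names and paths below are placeholders chosen for illustration.
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: example-mock-llm-test        # hypothetical test name
spec:
  steps:
    - name: setup
      try:
        # Apply manifests in dependency order (workflow step 3).
        - apply:
            file: manifests/01-mock-llm.yaml     # placeholder
        - apply:
            file: manifests/02-resources.yaml    # placeholder
    - name: verify
      try:
        # Validation checkpoint: the step fails unless the cluster state
        # matches the expected resource described in this file.
        - assert:
            file: manifests/03-assert.yaml       # placeholder
```

Running the test directory with `--fail-fast`, as the suggestion proposes, would then act as the explicit verification step for a newly written test.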
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The antipatterns section is thorough but somewhat verbose: the explanations of why each antipattern is bad (e.g., 'This command crashes with a type error when .status.conditions is nil') add useful context but could be tightened. The 'Bad/Good' code pairs are valuable but the surrounding prose is slightly padded. | 2 / 3 |
| Actionability | Provides fully executable commands for running tests, concrete YAML examples for both correct and incorrect patterns, specific file structure with naming conventions, and copy-paste ready code blocks throughout. | 3 / 3 |
| Workflow Clarity | The test structure and running commands are clear, but the writing workflow delegates to external files ('Reference tests/CLAUDE.md for comprehensive patterns' and 'see examples.md') without providing a clear step-by-step sequence for creating a new test. There's no explicit validation checkpoint for verifying a newly written test works correctly. | 2 / 3 |
| Progressive Disclosure | Good structure with clear sections (Running, Writing, Structure, Antipatterns, Environment Variables). References to tests/CLAUDE.md and examples.md are one level deep and clearly signaled. The main file provides enough to get started while pointing to detailed resources. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
fc5746e