ark-chainsaw-testing

Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.

Quality: 73% (Does it follow best practices?)

Impact: 95% (1.37x). Average score across 3 eval scenarios.

Security: Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/chainsaw/SKILL.md

Quality

Discovery

75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is well-structured with a clear 'what' and 'when' clause, and targets a very specific niche (Ark Chainsaw tests with mock-llm) that minimizes conflict risk. However, the actions described are somewhat high-level and could benefit from more concrete details about specific capabilities, and the trigger terms could include more natural variations users might use.

Suggestions

Add more specific concrete actions beyond 'run, write, debug' — e.g., 'configure mock-llm responses, analyze test output logs, set up test fixtures'

Include additional trigger term variations users might naturally say, such as 'end-to-end tests', 'integration tests', 'test failures', or 'chainsaw testing'

Dimension	Reasoning	Score
Specificity	Names the domain (Ark Chainsaw tests with mock-llm) and some actions (run, write, debug, create), but doesn't elaborate on what these tests involve or what specific capabilities are provided beyond high-level verbs.	2 / 3
Completeness	Clearly answers both what ('Run and write Ark Chainsaw tests with mock-llm') and when ('Use for running tests, debugging failures, or creating new e2e tests'), with explicit trigger scenarios.	3 / 3
Trigger Term Quality	Includes some relevant keywords like 'Ark Chainsaw tests', 'mock-llm', 'e2e tests', 'debugging failures', but these are fairly specialized terms. Missing common variations a user might say like 'integration tests', 'end-to-end', 'test failures', or 'chainsaw'.	2 / 3
Distinctiveness Conflict Risk	Very specific niche targeting 'Ark Chainsaw tests with mock-llm' — this is unlikely to conflict with other skills due to the highly specific tool/framework references.	3 / 3
	Total	10 / 12 Passed

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent concrete examples — particularly the antipatterns section with its Bad/Good YAML comparisons. The main weakness is the lack of an explicit step-by-step workflow for writing new tests (the skill delegates this entirely to referenced files) and some verbosity in the antipattern explanations that could be tightened. The progressive disclosure is well-handled with appropriate references to supporting files.

Suggestions

Add a brief numbered workflow for writing a new test (e.g., 1. Create directory structure, 2. Write chainsaw-test.yaml, 3. Run with --fail-fast, 4. Debug with --skip-delete if needed) to improve workflow clarity.

Tighten the antipattern explanations — the 'why' context (e.g., 'a known kubectl bug: kubernetes/kubernetes#66439') is useful but the surrounding prose could be reduced by ~30% while preserving clarity.
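
The workflow proposed in the first suggestion could be sketched roughly as follows. This is a minimal illustration, not the skill's actual content: the test name `example-query`, the `tests/` location, and the manifest file names are hypothetical, and the YAML only assumes the standard Kyverno Chainsaw `Test` resource. The `chainsaw` commands are printed rather than executed, since running them needs a cluster.

```shell
#!/usr/bin/env sh
set -eu

# Hypothetical test location; the reviewed skill may use a different layout.
TEST_DIR="tests/example-query"

# 1. Create the directory structure.
mkdir -p "$TEST_DIR/manifests"

# 2. Write a minimal chainsaw-test.yaml (manifest file names are placeholders).
cat > "$TEST_DIR/chainsaw-test.yaml" <<'EOF'
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: example-query
spec:
  steps:
    - name: setup-and-verify
      try:
        - apply:
            file: manifests/resources.yaml
        - assert:
            file: manifests/expected.yaml
EOF

# 3. Run the test, stopping at the first failure.
echo "chainsaw test $TEST_DIR --fail-fast"

# 4. If it fails, rerun while keeping the created resources for inspection.
echo "chainsaw test $TEST_DIR --fail-fast --skip-delete"
```

Printing the commands keeps the sketch safe to run anywhere; on a machine with a cluster and Chainsaw installed, the two `echo` lines would be replaced by the commands themselves.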

Dimension	Reasoning	Score
Conciseness	The antipatterns section is thorough but somewhat verbose — the explanations of why each antipattern is bad (e.g., 'This command crashes with a type error when .status.conditions is nil') are useful but could be tightened. The 'Bad/Good' pairs are clear but the surrounding prose adds bulk. Overall mostly efficient with some room to trim.	2 / 3
Actionability	Provides fully executable bash commands for running tests, concrete YAML examples for both correct and incorrect patterns, specific directory structures, and copy-paste ready code blocks. Every section gives concrete, actionable guidance.	3 / 3
Workflow Clarity	The running tests section provides clear commands, and the test structure shows file ordering conventions. However, the end-to-end workflow for writing a new test lacks explicit sequencing with validation checkpoints — it delegates to external files (tests/CLAUDE.md, examples.md) without summarizing the steps. The antipatterns section implicitly teaches workflow but doesn't present a cohesive create-validate-run sequence.	2 / 3
Progressive Disclosure	The skill provides a concise overview with clear one-level-deep references to tests/CLAUDE.md for comprehensive patterns and examples.md for a working example. The main content stays focused on running tests, structure, and antipatterns while deferring detailed writing guidance appropriately.	3 / 3
	Total	10 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mckinsey/agents-at-scale-ark
Commit: 6b7c761

Reviewed: 5 days ago
