
ark-chainsaw-testing

Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.

Score: 82 (1.37x)

Quality: 73% (Does it follow best practices?)

Impact: 95%, 1.37x (average score across 3 eval scenarios)

Security by Snyk: Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/chainsaw/SKILL.md

Quality

Discovery

75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is well-structured with a clear 'what' and 'when' clause, and targets a very specific niche (Ark Chainsaw tests with mock-llm) that minimizes conflict risk. However, the actions described are somewhat high-level and could benefit from more concrete details about specific capabilities, and the trigger terms could include more natural variations users might use.

Suggestions

Add more specific concrete actions beyond 'run, write, debug' — e.g., 'configure mock-llm responses, analyze test output logs, set up test fixtures'

Include additional trigger term variations users might naturally say, such as 'end-to-end tests', 'integration tests', 'test failures', or 'chainsaw testing'

| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Ark Chainsaw tests with mock-llm) and some actions (run, write, debug, create), but doesn't elaborate on what these tests involve or what specific capabilities are provided beyond high-level verbs. | 2 / 3 |
| Completeness | Clearly answers both what ('Run and write Ark Chainsaw tests with mock-llm') and when ('Use for running tests, debugging failures, or creating new e2e tests'), with explicit trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'Ark Chainsaw tests', 'mock-llm', 'e2e tests', 'debugging failures', but these are fairly specialized terms. Missing common variations a user might say like 'integration tests', 'end-to-end', 'test failures', or 'chainsaw'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Very specific niche targeting 'Ark Chainsaw tests with mock-llm'; this is unlikely to conflict with other skills due to the highly specific tool/framework references. | 3 / 3 |
| Total | Passed | 10 / 12 |

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent concrete examples — particularly the antipatterns section with its Bad/Good YAML comparisons. The main weakness is the lack of an explicit step-by-step workflow for writing new tests (the skill delegates this entirely to referenced files) and some verbosity in the antipattern explanations that could be tightened. The progressive disclosure is well-handled with appropriate references to supporting files.

Suggestions

Add a brief numbered workflow for writing a new test (e.g., 1. Create directory structure, 2. Write chainsaw-test.yaml, 3. Run with --fail-fast, 4. Debug with --skip-delete if needed) to improve workflow clarity.

Tighten the antipattern explanations — the 'why' context (e.g., 'a known kubectl bug: kubernetes/kubernetes#66439') is useful but the surrounding prose could be reduced by ~30% while preserving clarity.
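
As a rough illustration of the run and debug steps in the workflow suggested above, the sketch below assembles a `chainsaw test` invocation as an argument list without executing it. The helper name `build_chainsaw_command` and the directory `tests/my-new-test` are hypothetical, and the flags (`--test-dir`, `--fail-fast`, `--skip-delete`) should be checked against the installed Chainsaw version; the skill itself may wrap these commands differently.

```python
from typing import List


def build_chainsaw_command(
    test_dir: str,
    fail_fast: bool = True,
    skip_delete: bool = False,
) -> List[str]:
    """Assemble a `chainsaw test` argument list (nothing is executed here).

    Assumption: the flags below match the installed Chainsaw CLI.
    """
    cmd = ["chainsaw", "test", "--test-dir", test_dir]
    if fail_fast:
        # Stop at the first failing test for a faster feedback loop.
        cmd.append("--fail-fast")
    if skip_delete:
        # Leave resources in the cluster so a failure can be inspected.
        cmd.append("--skip-delete")
    return cmd


if __name__ == "__main__":
    # Hypothetical test directory, used only for illustration.
    print(" ".join(build_chainsaw_command("tests/my-new-test")))
    print(" ".join(build_chainsaw_command("tests/my-new-test", skip_delete=True)))
```

The first call corresponds to the normal run step; the second adds `--skip-delete` for the debugging step, when the created resources need to stay around for inspection.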

| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The antipatterns section is thorough but somewhat verbose: the explanations of why each antipattern is bad (e.g., 'This command crashes with a type error when .status.conditions is nil') are useful but could be tightened. The 'Bad/Good' pairs are clear but the surrounding prose adds bulk. Overall mostly efficient with some room to trim. | 2 / 3 |
| Actionability | Provides fully executable bash commands for running tests, concrete YAML examples for both correct and incorrect patterns, specific directory structures, and copy-paste ready code blocks. Every section gives concrete, actionable guidance. | 3 / 3 |
| Workflow Clarity | The running tests section provides clear commands, and the test structure shows file ordering conventions. However, the end-to-end workflow for writing a new test lacks explicit sequencing with validation checkpoints; it delegates to external files (tests/CLAUDE.md, examples.md) without summarizing the steps. The antipatterns section implicitly teaches workflow but doesn't present a cohesive create-validate-run sequence. | 2 / 3 |
| Progressive Disclosure | The skill provides a concise overview with clear one-level-deep references to tests/CLAUDE.md for comprehensive patterns and examples.md for a working example. The main content stays focused on running tests, structure, and antipatterns while deferring detailed writing guidance appropriately. | 3 / 3 |
| Total | Passed | 10 / 12 |
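
The Conciseness reasoning quotes an antipattern in which a command fails when `.status.conditions` is nil. The skill's actual command is not shown in this review, so the following is only a sketch of the general idea: looking up a condition on a Kubernetes-style resource while tolerating a missing `status` or `conditions` field. The function name `condition_status` and the sample resources are hypothetical.

```python
from typing import Any, Dict, Optional


def condition_status(resource: Dict[str, Any], condition_type: str) -> Optional[str]:
    """Return the status of the named condition, or None when it is absent.

    Treats a missing or null `status` / `conditions` field as "no conditions
    yet" instead of raising a type error.
    """
    status = resource.get("status") or {}
    conditions = status.get("conditions") or []
    for condition in conditions:
        if condition.get("type") == condition_type:
            return condition.get("status")
    return None


if __name__ == "__main__":
    # A freshly created resource may have no status at all.
    pending = {"metadata": {"name": "example"}}
    ready = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
    print(condition_status(pending, "Ready"))
    print(condition_status(ready, "Ready"))
```

Returning `None` for the not-yet-populated case lets a caller keep polling rather than crash, which is the behavior the quoted explanation implies the "Good" variant should have.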

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mckinsey/agents-at-scale-ark (Reviewed)
