
ark-chainsaw-testing

Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.

81

Quality: 73%. Does it follow best practices?

Impact: 92% (4.84x). Average score across 3 eval scenarios.

Security by Snyk: Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/chainsaw/SKILL.md

Quality

Discovery: 75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is functional and clearly scoped to a specific tool (Ark Chainsaw with mock-llm), which makes it distinctive and unlikely to conflict with other skills. It successfully covers both 'what' and 'when' dimensions. However, it could benefit from more specific action descriptions and additional natural trigger terms to improve discoverability.

Suggestions

Expand the specific actions listed, e.g., 'configure mock-llm responses, analyze test output, set up test fixtures' to improve specificity.

Add natural keyword variations users might use, such as 'end-to-end tests', 'test failures', 'chainsaw test suite', or 'mock LLM'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (Ark Chainsaw tests with mock-llm) and some actions (run, write, debug, create), but doesn't elaborate on what these tests involve or what specific capabilities are provided (e.g., configuring mock responses, analyzing test output, setting up fixtures). | 2 / 3 |
| Completeness | Clearly answers both 'what' (run and write Ark Chainsaw tests with mock-llm) and 'when' ('Use for running tests, debugging failures, or creating new e2e tests'), providing explicit trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'Ark Chainsaw tests', 'mock-llm', 'e2e tests', 'debugging failures', and 'running tests'. However, it misses natural variations users might say such as 'end-to-end tests', 'test failures', 'chainsaw test suite', or 'mock LLM responses'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Very specific niche targeting 'Ark Chainsaw tests with mock-llm'; this is unlikely to conflict with other skills due to the highly specific product/tool name and testing context. | 3 / 3 |
| Total | | 10 / 12 (Passed) |

Implementation: 72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, concise skill that excels at progressive disclosure and token efficiency. Its main weakness is that the 'Writing Tests' section lacks any inline actionable guidance, fully deferring to external files, which leaves the skill somewhat incomplete as a standalone reference for test creation. The running tests section is strong with concrete commands.

Suggestions

Add a minimal inline example or key steps for writing a new test (e.g., the essential chainsaw-test.yaml structure) so the skill is actionable without requiring external file reads.

Include a brief sequential workflow for creating a new test: create directory → write manifests → write test definition → run and validate.
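The two suggestions above can be sketched together as a short shell walkthrough: create a directory, write a manifest, write a `chainsaw-test.yaml`, then run it. This is only an illustration under stated assumptions. The directory name, the `RUN_CHAINSAW` switch, and the placeholder ConfigMap are hypothetical and not taken from the skill; a real Ark test would apply Ark and mock-llm resources following the patterns in `tests/CLAUDE.md` and `examples.md`. The `Test` structure follows the general Kyverno Chainsaw `v1alpha1` format.

```shell
#!/bin/sh
set -eu

# Hypothetical location; the repository's real test layout may differ.
TEST_DIR="${TEST_DIR:-./tmp-chainsaw-example/hello-configmap}"

# 1. Create the test directory.
mkdir -p "$TEST_DIR"

# 2. Write the manifest to apply. A plain ConfigMap stands in for the
#    Ark resources (placeholder, not part of the skill).
cat > "$TEST_DIR/configmap.yaml" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: hello
data:
  greeting: hello
EOF

# 3. Write the test definition: apply the manifest, then assert that a
#    matching resource exists in the cluster.
cat > "$TEST_DIR/chainsaw-test.yaml" <<'EOF'
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: hello-configmap
spec:
  steps:
  - name: apply-and-assert
    try:
    - apply:
        file: configmap.yaml
    - assert:
        file: configmap.yaml
EOF

# 4. Run and validate. Only runs when explicitly enabled, since it needs
#    the chainsaw CLI and a reachable cluster.
if [ "${RUN_CHAINSAW:-0}" = "1" ] && command -v chainsaw >/dev/null 2>&1; then
  chainsaw test --test-dir "$TEST_DIR"
else
  echo "Wrote test files to $TEST_DIR (set RUN_CHAINSAW=1 to run them)"
fi
```

Keeping the run step behind a switch lets the files be generated and reviewed without a cluster; setting `RUN_CHAINSAW=1` executes the test against the current kubeconfig context.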

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is lean and efficient. Every section provides only essential information without explaining concepts Claude already knows. No unnecessary preamble or padding. | 3 / 3 |
| Actionability | Running tests has concrete, copy-paste-ready commands. However, the 'Writing Tests' section delegates entirely to external files without providing any inline actionable guidance, and the test structure section is descriptive rather than instructive. | 2 / 3 |
| Workflow Clarity | The running tests section provides clear commands including debug mode, but there's no explicit workflow sequence for writing tests (e.g., steps to create a new test, validate it, run it). The writing process is entirely deferred to external references. | 2 / 3 |
| Progressive Disclosure | Content is well-structured with a concise overview and clear one-level-deep references to `tests/CLAUDE.md` and `examples.md` for detailed patterns. The skill appropriately keeps the overview brief and delegates comprehensive content. | 3 / 3 |
| Total | | 10 / 12 (Passed) |

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mckinsey/agents-at-scale-ark