ark-chainsaw-testing

Run and write Ark Chainsaw tests with mock-llm. Use for running tests, debugging failures, or creating new e2e tests.

95

1.37x
Quality

Does it follow best practices?

Impact: 95% (1.37x), average score across 3 eval scenarios

Security by Snyk

Advisory: Suggest reviewing before use

Quality

Content

92%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is concise, highly actionable, and sequences risky operations with explicit validation checkpoints. The one gap is progressive disclosure: the two referenced files (examples.md, tests/CLAUDE.md) are not present in the skill bundle, so those links resolve to nothing.

Suggestions

Add the referenced examples.md (and tests/CLAUDE.md) to the skill bundle, or remove the dead links from the body, so navigation actually resolves.

Consider moving the three antipattern sections into a references/ file and keeping a one-line pointer in SKILL.md, since they account for most of the body's length.

Verify the manifest naming convention (a03/a04/a05) is documented in the referenced examples rather than only inline, so the inline overview stays a true overview.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The body is lean with no boilerplate explaining what Chainsaw or Kubernetes is; each prose line (e.g. "This pattern converts a brittle timeout into an explicit condition check") earns its place by justifying a rule, fitting the 'lean and efficient' top anchor rather than the 'could be tightened' middle one. | 3 / 3 |
| Actionability | It provides copy-paste-ready commands (`chainsaw test --selector 'standard'`) and complete bad/good YAML examples, fully executable rather than pseudocode. | 3 / 3 |
| Workflow Clarity | Sequencing is explicit with a validation checkpoint: "Use assert only after wait has confirmed the query is done" is shown as separate `wait-for-query-completion` then `validate-response` steps with timing rationale, matching the top anchor's clear sequence with checkpoints. | 3 / 3 |
| Progressive Disclosure | References are clearly signaled at one level ([examples.md](examples.md), `tests/CLAUDE.md`), but neither file exists in the bundle (no references/scripts/assets dirs and the files are absent), so the navigation is broken; per the bundle-structure guideline this caps the score below the top anchor. | 2 / 3 |
| Total | | 11 / 12 Passed |

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A concise, specific description that states concrete actions, gives an explicit 'Use for...' trigger, and occupies a clearly distinct niche. It cleanly meets the top anchor on every dimension.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | "Run and write Ark Chainsaw tests with mock-llm" plus "running tests, debugging failures, or creating new e2e tests" lists multiple specific concrete actions, matching the top anchor; it is not merely naming a domain with some actions. | 3 / 3 |
| Completeness | It explicitly states what ("Run and write Ark Chainsaw tests with mock-llm") and when via an explicit "Use for..." trigger clause ("Use for running tests, debugging failures, or creating new e2e tests"), satisfying both halves. | 3 / 3 |
| Trigger Term Quality | Natural terms a user would say are well covered ("running tests, debugging failures, e2e tests, Chainsaw tests"), going beyond a single technical label. | 3 / 3 |
| Distinctiveness Conflict Risk | "Ark Chainsaw" + "mock-llm" define a clear niche with distinct triggers unlikely to collide with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| relative_links | Relative link issues: 1 missing | Warning |
| Total | | 15 / 16 Passed |
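The `relative_links` warning reports a link target that is missing from the bundle. A minimal sketch of how such a check could be reproduced locally is shown here; it is not the validator's actual implementation. The function name `find_missing_relative_links` and the regular expression are illustrative assumptions, and the sketch only handles inline Markdown links of the form `[text](target)`.

```python
import re
from pathlib import Path

# Assumption: only inline Markdown links such as [examples.md](examples.md)
# are checked; reference-style links and bare paths are ignored.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")

# Targets that start with a URL scheme (https:, mailto:, ...) are not relative.
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_missing_relative_links(skill_file: Path) -> list[str]:
    """Return relative link targets in skill_file that do not exist on disk."""
    text = skill_file.read_text(encoding="utf-8")
    missing = []
    for target in LINK_PATTERN.findall(text):
        # Skip absolute URLs and in-page anchors.
        if SCHEME_PATTERN.match(target) or target.startswith("#"):
            continue
        # Drop any "#fragment" before resolving against the file's directory.
        path_part = target.split("#", 1)[0]
        if not (skill_file.parent / path_part).exists():
            missing.append(target)
    return missing
```

Calling `find_missing_relative_links(Path("SKILL.md"))` from the skill directory returns the list of unresolved targets; an empty list means every inline relative link resolves.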

Repository: mckinsey/agents-at-scale-ark
Reviewed
