
ark-pentest-issue-resolver

Resolve common penetration testing issues in Ark. Use when fixing security vulnerabilities from pentest reports, security audits, or OWASP Top 10 issues.

84

1.04x
Quality

Does it follow best practices?

Impact

96%

1.04x

Average score across 3 eval scenarios

Security by Snyk

Passed

No known issues


Quality

Content

65%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The content is highly actionable with comprehensive, executable detection and mitigation patterns and a sensible workflow, but it is a long monolith that re-explains familiar concepts and lacks progressive disclosure or explicit error-recovery feedback loops.

Suggestions

Move the 15 detailed issue categories into one-level-deep reference files (e.g. REFERENCES/injection.md, REFERENCES/xss.md) and keep SKILL.md as a concise overview with links, to improve progressive disclosure and conciseness.

Remove or trim the per-issue "Description" lines, which re-explain vulnerability concepts Claude already knows.

Turn Step 6 (Test the Fixes) into an explicit validate-fix-retry feedback loop with concrete pass/fail criteria and re-run commands, since security fixes can break functionality.
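The validate-fix-retry loop suggested above could look roughly like the following. This is a minimal sketch, not the skill's actual Step 6: `CheckResult`, `validate_fix_retry`, `apply_fix`, and `run_checks` are hypothetical names, and the pass criterion (every check reports `passed`) and the attempt limit are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical shape, e.g. a test suite or a scan)."""
    name: str
    passed: bool
    detail: str = ""


def validate_fix_retry(
    apply_fix: Callable[[int], None],
    run_checks: Callable[[], List[CheckResult]],
    max_attempts: int = 3,
) -> bool:
    """Apply a fix, re-run the checks, and retry until all pass or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        apply_fix(attempt)  # the caller decides how the fix changes per attempt
        failures = [result for result in run_checks() if not result.passed]
        if not failures:
            print(f"attempt {attempt}: all checks passed")
            return True
        for failure in failures:
            print(f"attempt {attempt}: FAIL {failure.name}: {failure.detail}")
    # Attempts exhausted: report failure so the caller can roll back or escalate.
    return False
```

In practice `run_checks` would wrap whatever re-run commands the skill specifies (functional tests plus the original detection pattern), so that a fix counts as passing only when the vulnerability is gone and existing behavior still works.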

Dimension | Reasoning | Score

Conciseness

The ~1000-line body is mostly efficient actionable code, but per-issue "Description" fields re-explain well-known concepts (e.g. what XSS, CSRF, SQL injection are) that Claude already knows, fitting the score-2 anchor of mostly efficient with some unnecessary explanation.

2 / 3

Actionability

Each category supplies executable vulnerable/secure code using real libraries (defusedxml, DOMPurify, gorilla/csrf, flask-limiter, pydantic), plus concrete grep detection commands and checklists, matching the fully-executable score-3 anchor.

3 / 3

Workflow Clarity

The 7-step workflow is sequenced with an approval gate (Step 4) and a test step (Step 6), but it lacks an explicit validate-fix-retry feedback loop for potentially destructive security changes, so it is capped at 2 per the destructive-operation scoring note.

2 / 3

Progressive Disclosure

No bundle files exist and the entire ~1000-line catalog lives inline in SKILL.md. Sections are well organized, but the detailed category content belongs in one-level-deep reference files, matching the score-2 anchor.

2 / 3

Total: 9 / 12

Passed
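The suggestion to move the issue categories into one-level-deep reference files could be automated along these lines. This is a sketch under stated assumptions: it assumes each category in SKILL.md starts at a level-2 Markdown heading (`## `), which the review does not confirm, and `slugify` and `split_skill` are hypothetical helper names.

```python
import re
from typing import Dict, Tuple


def slugify(title: str) -> str:
    """Turn a heading title into a lowercase, hyphenated file stem."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def split_skill(markdown: str, ref_dir: str = "references") -> Tuple[str, Dict[str, str]]:
    """Split a Markdown document on level-2 headings (assumed category boundary).

    Returns (overview, files): the overview keeps the preamble plus one link
    per section; files maps a relative path to that section's full text.
    """
    # With one capture group, re.split yields [preamble, title1, body1, title2, body2, ...].
    parts = re.split(r"(?m)^## +(.+)$", markdown)
    preamble, rest = parts[0], parts[1:]
    files: Dict[str, str] = {}
    links = []
    for title, body in zip(rest[0::2], rest[1::2]):
        title = title.strip()
        path = f"{ref_dir}/{slugify(title)}.md"
        files[path] = f"# {title}\n{body.rstrip()}\n"
        links.append(f"- [{title}]({path})")
    overview = preamble.rstrip() + "\n\n" + "\n".join(links) + "\n"
    return overview, files
```

Note that this naive split would also break on a line starting with `## ` inside a fenced code block (for example a shell comment), so the output needs a manual review before it replaces the original file.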

Description

90%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is clear, third-person, and answers both what and when with strong, natural trigger terms. Its only weakness is that the action statement is a single general verb rather than a list of concrete capabilities.

Suggestions

Expand the action statement to list concrete capabilities, e.g. "Detect vulnerable code patterns, apply mitigations, and verify fixes for common penetration testing issues in Ark."

Consider adding the specific issue families (e.g. SQL injection, XSS, CSRF) to the description to further sharpen specificity.

Dimension | Reasoning | Score

Specificity

The phrase "Resolve common penetration testing issues in Ark" names the domain and a single action but does not enumerate multiple concrete actions (e.g. detect, mitigate, patch), matching the score-2 anchor rather than the multi-action score-3 anchor.

2 / 3

Completeness

It states what ("Resolve common penetration testing issues in Ark") and gives an explicit "Use when fixing..." trigger clause, satisfying both what and when per the score-3 anchor.

3 / 3

Trigger Term Quality

"pentest reports, security audits, or OWASP Top 10 issues" plus "security vulnerabilities" gives good coverage of natural terms a user would say, matching the score-3 anchor.

3 / 3

Distinctiveness / Conflict Risk

The niche is narrow (penetration-testing findings / OWASP Top 10 in Ark) with distinct triggers, making overlap with unrelated skills unlikely per the score-3 anchor.

3 / 3

Total: 11 / 12

Passed

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

Criteria | Description | Result

skill_md_line_count

SKILL.md is long (1013 lines); consider splitting into references/ and linking

Warning

Total: 15 / 16

Passed
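A check like `skill_md_line_count` can be approximated in a few lines. The sketch below is illustrative only: the 500-line threshold is an assumption (the review reports the 1013-line count but not the limit), and `check_line_count` is a hypothetical name, not the validator's actual implementation.

```python
from typing import Tuple

MAX_LINES = 500  # assumed threshold; the review does not state the real limit


def check_line_count(text: str, max_lines: int = MAX_LINES) -> Tuple[str, str]:
    """Return (result, description) for a SKILL.md body, warning when it is too long."""
    count = len(text.splitlines())
    if count > max_lines:
        return (
            "Warning",
            f"SKILL.md is long ({count} lines); consider splitting into references/ and linking",
        )
    return "Passed", f"SKILL.md has {count} lines"
```

The result is a warning rather than a failure, which matches how the table above reports the finding while the skill still passes validation overall.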

Repository
mckinsey/agents-at-scale-ark
Reviewed
