ark-research

Research technical solutions by searching the web, examining GitHub repos, and gathering evidence. Use when exploring implementation options or evaluating technologies.

Quality: 85
Does it follow best practices?

Impact: 68% (1.65x)
Average score across 3 eval scenarios

Security by Snyk: Advisory
Suggest reviewing before use

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body delivers concrete executable commands, a complete output template, and a clearly sequenced workflow with a real validation gate and feedback loop, organized into well-labeled sections. The only notable weakness is mild verbosity in a few framing lists that could be tightened.

Suggestions

Tighten the 'Look for:' and 'Save:' lists and the 'Store findings in ./scratch/research/ for review:' framing, trimming tokens that underestimate Claude's competence.

Consider folding the duplicated 'insufficient evidence' guidance (the prose line and the quoted prompt) into a single concise checkpoint to reduce redundancy.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Mostly efficient and task-focused (e.g. 'GitHub raw content is often blocked. Clone repos to examine them:'), but includes some lists and framing ('Store findings in ... for review', 'Look for:') that could be tightened; not score 3 because not every token earns its place, and not score 1 because it avoids padded explanations of concepts Claude already knows. | 2 / 3 |
| Actionability | Provides executable, copy-paste-ready commands ('git clone https://github.com/owner/repo.git', 'cat /tmp/repo/README.md', 'mkdir -p ./scratch/research') plus a complete markdown output template and a worked example trace; not score 2 because the guidance is concrete and complete rather than pseudocode. | 3 / 3 |
| Workflow Clarity | A clearly numbered five-step research sequence with an explicit validation gate ('Minimum 2-3 datapoints required before recommending') and a feedback loop ('If insufficient evidence, ask for guidance'); not score 2 because the validation checkpoint and retry path are explicit rather than implicit. | 3 / 3 |
| Progressive Disclosure | No bundle files exist and none are needed; content is organized into clearly labeled sections (Research Process, Output Format, Example Usage) within a single well-structured file, satisfying the simple-skill allowance for scoring 3 on organization alone; not score 2 because navigation is clear and not a monolithic wall of text. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
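
The validation gate and feedback loop quoted in the Workflow Clarity reasoning ('Minimum 2-3 datapoints required before recommending', 'If insufficient evidence, ask for guidance') can be pictured as a small check. The following is only a sketch under stated assumptions: the `Datapoint` structure, the function name `check_evidence_gate`, the default threshold of 2, and the choice to count distinct sources are all hypothetical and do not come from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class Datapoint:
    """One piece of evidence gathered during research (hypothetical structure)."""
    source: str   # e.g. a URL or a cloned repo path
    finding: str  # what the source showed


def check_evidence_gate(datapoints: list[Datapoint], minimum: int = 2) -> str:
    """Return 'recommend' when enough evidence exists, else 'ask_for_guidance'."""
    # Assumption: count distinct sources so one page cited twice is not double-counted.
    distinct_sources = {dp.source for dp in datapoints}
    if len(distinct_sources) >= minimum:
        return "recommend"
    # Mirrors the feedback loop: stop and ask rather than recommend on thin evidence.
    return "ask_for_guidance"
```

An explicit return value for the insufficient-evidence case keeps the retry path visible to the caller, which is the property the review credits the workflow for.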

Description

92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description that names concrete actions and includes an explicit 'Use when' trigger, covering both what and when clearly. Its main weakness is a moderately broad niche that could overlap with adjacent research or web-search skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names three concrete actions ('searching the web, examining GitHub repos, and gathering evidence'), matching the anchor for multiple specific concrete actions; not score 2 because it lists more than just a domain and partial actions. | 3 / 3 |
| Completeness | Answers both what ('Research technical solutions by searching the web...') and when with an explicit 'Use when exploring implementation options or evaluating technologies' clause; not score 2 because the when-trigger is explicit, not merely implied. | 3 / 3 |
| Trigger Term Quality | Natural terms a user would say ('web', 'GitHub repos', 'research', 'evaluating technologies', 'exploring implementation options') appear with good coverage; not score 2 because it is not missing common variations to a degree that weakens triggering. | 3 / 3 |
| Distinctiveness Conflict Risk | The 'research technical solutions / evaluating technologies' niche is somewhat specific but could overlap with a general web-search or coding skill; not score 3 because the triggers are not tightly distinct from related skills, and not score 1 because it is more specific than 'helps with code and documents'. | 2 / 3 |
| Total | | 11 / 12 (Passed) |

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| relative_links | Relative link issues: 3 missing | Warning |
| Total | | 15 / 16 (Passed) |
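
The `relative_links` warning reports relative links whose targets are missing. The validator's actual logic is not shown on this page, so the following is only an illustrative sketch of how such a check might work: the function name `missing_relative_links`, the link regex, and the rule for skipping URLs and anchors are assumptions, not the tool's implementation.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target); image links match too.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme, e.g. https: or mailto:.
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def missing_relative_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    text = markdown_file.read_text(encoding="utf-8")
    missing = []
    for target in LINK_PATTERN.findall(text):
        # Skip absolute URLs, mail links, and in-page anchors.
        if SCHEME_PATTERN.match(target) or target.startswith("#"):
            continue
        # Drop any fragment before resolving against the file's directory.
        path_part = target.split("#", 1)[0]
        if not (markdown_file.parent / path_part).exists():
            missing.append(target)
    return missing
```

Running a check like this against a skill's `SKILL.md` before publishing would list each unresolved target, which is the kind of information needed to clear the warning.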

Repository: mckinsey/agents-at-scale-ark
Reviewed
