ark-research

Research technical solutions by searching the web, examining GitHub repos, and gathering evidence. Use when exploring implementation options or evaluating technologies.

1.65x

Quality

62%

Does it follow best practices?

Impact

68%

1.65x

Average score across 3 eval scenarios

Securityby

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/research/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is structurally sound with a clear 'what' and 'when' clause, which is its strongest aspect. However, the actions described are somewhat generic ('gathering evidence', 'searching the web') and the trigger terms could be more comprehensive to cover the natural language users would employ when needing technical research assistance. It would benefit from more specific actions and broader keyword coverage.

Suggestions

Add more specific concrete actions such as 'compare libraries, evaluate dependencies, review API documentation, benchmark alternatives'

Expand trigger terms to include natural user phrases like 'compare libraries', 'find a package', 'tech stack decision', 'look up docs', 'which framework should I use'

Dimension	Reasoning	Score
Specificity	Names the domain (technical research) and some actions ('searching the web', 'examining GitHub repos', 'gathering evidence'), but these are somewhat general and not highly concrete — e.g., 'gathering evidence' is vague. It doesn't list specific outputs or detailed operations.	2 / 3
Completeness	Clearly answers both 'what' (research technical solutions by searching the web, examining GitHub repos, gathering evidence) and 'when' (use when exploring implementation options or evaluating technologies) with an explicit 'Use when...' clause.	3 / 3
Trigger Term Quality	Includes some relevant keywords like 'searching the web', 'GitHub repos', 'implementation options', 'evaluating technologies', but misses common user phrases like 'compare libraries', 'find a package', 'look up documentation', 'tech stack', 'alternatives', or 'best practices'.	2 / 3
Distinctiveness Conflict Risk	The description is somewhat specific to technical research and web searching, but 'exploring implementation options' and 'evaluating technologies' could overlap with general coding assistance or architecture skills. The mention of GitHub repos and web searching helps differentiate it, but the scope is still fairly broad.	2 / 3
	Total	9 / 12 Passed

Implementation

57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a moderately effective research skill that provides a clear structure and useful output template, but lacks the specificity needed for high actionability. The workflow is sequential but missing validation checkpoints for verifying research quality. Some sections explain things Claude would naturally do (asking users for content, creating directories), reducing token efficiency.

Suggestions

Add specific tool invocations (e.g., which MCP tools or commands to use for web search) rather than abstract 'web search' instructions to improve actionability.

Add a validation checkpoint after evidence gathering—e.g., 'Before presenting: verify each source is accessible, check repo last-commit dates, confirm at least 2 independent sources agree on the approach.'

Remove the 'Handle Blocked Content' section or reduce it to a single line—Claude naturally knows to ask users for inaccessible content, making this section low-value.

Dimension	Reasoning	Score
Conciseness	Generally efficient but includes some unnecessary explanation. The 'Handle Blocked Content' section explains obvious fallback strategies Claude would naturally employ. The 'Local Research Workspace' section's mkdir command is trivial. Some sections could be tightened.	2 / 3
Actionability	Provides concrete bash commands for cloning repos and creating directories, and a useful output format template. However, the core research steps are more procedural guidance than executable instructions—web search steps are vague ('web search: terminal recording library node typescript') and the skill lacks specific tool invocations or API calls Claude should use.	2 / 3
Workflow Clarity	Steps are clearly numbered and sequenced, and the example usage at the end walks through a concrete scenario. However, there are no validation checkpoints—no step to verify research quality before presenting, no feedback loop for when sources conflict, and no explicit criteria for when research is 'sufficient' beyond the vague '2-3 datapoints' mention.	2 / 3
Progressive Disclosure	For a skill of this size (~80 lines) with no bundle files, the content is well-organized into logical sections with clear headers. The structure flows naturally from process to output format to example, and no content feels like it needs to be split into separate files.	3 / 3
	Total	9 / 12 Passed

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mckinsey/agents-at-scale-ark
Commit: 6b7c761

Reviewed: 5 days ago

Table of Contents

Discovery Implementation Validation

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.