
ark-research

Research technical solutions by searching the web, examining GitHub repos, and gathering evidence. Use when exploring implementation options or evaluating technologies.

Overall: 67

Quality: 58% (Does it follow best practices?)

Impact: 68% (1.65x, average score across 3 eval scenarios)

Security by Snyk: Advisory (suggest reviewing before use)

Optimize this skill with Tessl:

npx tessl skill review --optimize ./.claude/skills/research/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is structurally sound with a clear 'what' and 'when' clause, earning full marks on completeness. However, the actions described are moderately specific rather than highly concrete, and the trigger terms could be expanded to cover more natural user phrasings like 'compare libraries', 'find a package', or 'tech stack comparison'. The skill could also be more distinctive to avoid overlap with general web search skills.

Suggestions

- Add more specific, concrete actions such as 'compare libraries, read documentation, analyze GitHub stars/issues, benchmark alternatives'.
- Expand trigger terms in the 'Use when' clause to include natural user phrases like 'compare libraries', 'find a package', 'tech stack', 'best practices', and 'alternatives'.
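Applying both suggestions, a revised frontmatter might read as follows. This is a hypothetical sketch: the `name` and `description` fields assume the standard Claude SKILL.md frontmatter convention, and the exact wording is illustrative rather than taken from the repository.

```markdown
---
name: ark-research
description: Research technical solutions by searching the web, reading documentation,
  comparing libraries, and analyzing GitHub repos (stars, open issues, README quality).
  Use when exploring implementation options, evaluating technologies, comparing
  libraries, choosing a tech stack, or finding a package or alternatives.
---
```

The concrete actions (comparing libraries, analyzing stars/issues) address the specificity gap, while the expanded 'Use when' clause covers the missing trigger phrases.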

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (technical research) and some actions ('searching the web', 'examining GitHub repos', 'gathering evidence'), but these are somewhat general and not highly concrete; 'gathering evidence' is vague, and no specific outputs or detailed operations are listed. | 2 / 3 |
| Completeness | Clearly answers both 'what' (research technical solutions by searching the web, examining GitHub repos, gathering evidence) and 'when' (use when exploring implementation options or evaluating technologies) with an explicit 'Use when...' clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'searching the web', 'GitHub repos', 'implementation options', 'evaluating technologies', but misses common user phrases like 'compare libraries', 'find a package', 'look up documentation', 'tech stack', 'alternatives', or 'best practices'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The scope is somewhat distinct (technical research via web/GitHub), but 'searching the web' and 'gathering evidence' could overlap with general web search or research skills. The focus on 'implementation options' and 'technologies' helps narrow it, but not enough to be fully distinctive. | 2 / 3 |
| Total | | 9 / 12 |

Passed

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a reasonable framework for technical research with a clear process flow and useful output template. However, it leans toward describing what to do rather than providing precise, tool-specific instructions (e.g., which search tool to invoke, exact parameters). The evidence requirements section is a good constraint but lacks a verification mechanism to enforce it.

Suggestions

- Add specific tool invocations for web search (e.g., the exact search tool/function Claude should use) rather than just saying 'web search first'.
- Add a validation checkpoint before the recommendation step that explicitly verifies the 2-3 datapoint minimum is met, with a decision branch for insufficient evidence.
- Trim obvious advice like 'Look for: README documentation, Code examples', which Claude already knows to examine when reviewing repositories.
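The suggested validation checkpoint could be expressed inside SKILL.md itself as an explicit gate step with a decision branch. The heading and wording below are a sketch, not text from the actual skill:

```markdown
## Checkpoint: verify evidence before recommending

Before writing the recommendation, confirm for each candidate:

- At least 2-3 independent datapoints (e.g., official docs, GitHub issues,
  benchmarks) support its assessment.
- If any candidate falls below the minimum, return to the research steps and
  gather more evidence, or state explicitly that the evidence is insufficient
  and exclude that candidate from the recommendation.
```

Placing the gate immediately before the recommendation step gives the agent a concrete stop condition instead of relying on it to remember the threshold.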

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Mostly efficient but includes some unnecessary explanation. Steps like 'Handle Blocked Content' with example prompts and the general advice to 'look for README documentation, code examples' are somewhat obvious for Claude. The output format template is useful but slightly verbose. | 2 / 3 |
| Actionability | Provides some concrete commands (git clone, mkdir) and a clear output template, but much of the guidance is procedural description rather than executable. The 'web search first' step lacks specific tool invocations or search query patterns. The example usage at the end is a helpful walkthrough but still fairly high-level. | 2 / 3 |
| Workflow Clarity | Steps are clearly numbered and sequenced, but there are no validation checkpoints or feedback loops. For a research workflow, there is no explicit step to verify source quality, cross-reference findings, or validate that the minimum evidence threshold is actually met before proceeding to recommendations. | 2 / 3 |
| Progressive Disclosure | Content is reasonably well-structured with clear sections, but everything is inline in a single file. The output format template and example usage could potentially be separated. For a skill of this length (~80 lines), the organization is adequate, but the lack of any external references or layered structure keeps it from a 3. | 2 / 3 |
| Total | | 8 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mckinsey/agents-at-scale-ark (Reviewed)
