
custom-index-eval

Iterative evaluation of Fusion Framework MCP search quality against documented domain patterns. Loads domain files from eval/index/, queries Fusion MCP for each pattern, validates recall against must/should requirements, and produces a human-readable pass/fail report. USE FOR: eval core, eval all, evaluate MCP index accuracy, validate search recall for a domain, check index freshness. DO NOT USE FOR: writing domain patterns, populating eval/index/ files, running CI pipelines, or batch automation.

72

Quality

88%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description: it states specific capabilities, provides explicit trigger terms in a USE FOR clause, and reduces conflict risk with a DO NOT USE FOR exclusion list. It is concise yet comprehensive, covering the full evaluation workflow from loading domain files to producing pass/fail reports. The only minor note is the domain-specific terminology (Fusion Framework MCP), which is appropriate given the specialized nature of the skill.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: loads domain files from eval/index/, queries Fusion MCP for each pattern, validates recall against must/should requirements, and produces a human-readable pass/fail report. | 3 / 3 |
| Completeness | Clearly answers both 'what' (iterative evaluation of Fusion Framework MCP search quality, loads domain files, queries, validates recall, produces report) and 'when' (explicit USE FOR clause with trigger phrases, plus DO NOT USE FOR exclusions). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'eval core', 'eval all', 'evaluate MCP index accuracy', 'validate search recall', 'check index freshness'. Also includes negative triggers (DO NOT USE FOR) which help prevent mismatches. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche: Fusion Framework MCP search quality evaluation. The explicit DO NOT USE FOR clause further reduces conflict risk by excluding adjacent tasks like writing domain patterns or running CI pipelines. | 3 / 3 |

Total: 12 / 12 (Passed)

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, actionable skill with a clear multi-step workflow, concrete tool references, and a thorough worked example. Its main weaknesses are moderate verbosity (some sections could be tightened) and unverifiable external references since no bundle files were provided. The safety constraints and edge-case handling are strong.

Suggestions

Trim the 'When not to use' section to remove future-phase items that don't help Claude make decisions now, and condense 'Required inputs' since Claude can infer defaults from the instructions.

Provide the referenced bundle files (agents/query-judge.md, assets/report-template.md) or note their absence — without them, the sub-agent delegation in Step 3 and report template in Step 4 are not fully actionable.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient but includes some unnecessary sections like 'When not to use' with future-phase items, and the 'Required inputs' section restates things Claude could infer. The example is valuable but slightly verbose. Some trimming is possible without losing clarity. | 2 / 3 |
| Actionability | The skill provides concrete, step-by-step instructions with specific file paths, MCP tool names (`mcp_fusion_search_framework`), parsing rules for domain files (`must`/`should` bullets, `##` headings), a complete example with a realistic report table, and a clear fallback for when sub-agents aren't available. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with explicit handling of edge cases (missing files, no sub-agent support, MCP unavailability). Validation is embedded: ambiguous results must be marked 'partial', fabricated passes are prohibited, and MCP failures trigger a clear stop. The collect-all-then-report pattern prevents premature output. | 3 / 3 |
| Progressive Disclosure | The skill references external files (`agents/query-judge.md`, `assets/report-template.md`, `eval/index/README.md`) which is good progressive disclosure design, but no bundle files are provided to verify these exist. The main SKILL.md itself is somewhat long and could benefit from moving the detailed example or the domain file parsing spec into a referenced file. | 2 / 3 |

Total: 10 / 12 (Passed)
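
To make the reviewed workflow concrete, here is a minimal Python sketch of the parse, grade, and collect-then-report steps the review describes. It is not the skill's actual implementation. The domain-file layout (a `## ` heading per pattern followed by `- must: ...` and `- should: ...` bullets), the grading rule (any missing `must` term is a fail, a missing `should` term is a partial), and the names `Pattern`, `parse_domain`, `grade`, and `evaluate` are all assumptions made for illustration. The `search` argument is a hypothetical stand-in for a call to the MCP search tool.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Pattern:
    """One documented pattern: a heading plus its required terms (assumed layout)."""
    name: str
    must: list[str] = field(default_factory=list)
    should: list[str] = field(default_factory=list)


# Assumed bullet shape: "- must: some term" or "- should: some term".
BULLET = re.compile(r"^\s*[-*]\s*(must|should)\s*:\s*(.+?)\s*$", re.IGNORECASE)


def parse_domain(text: str) -> list[Pattern]:
    """Split a domain file into patterns, one per `## ` heading."""
    patterns: list[Pattern] = []
    for line in text.splitlines():
        if line.startswith("## "):
            patterns.append(Pattern(name=line[3:].strip()))
            continue
        match = BULLET.match(line)
        if match and patterns:
            kind, term = match.group(1).lower(), match.group(2)
            getattr(patterns[-1], kind).append(term)
    return patterns


def grade(pattern: Pattern, result_text: str) -> str:
    """Assumed rule: missing `must` -> fail, missing `should` -> partial, else pass."""
    haystack = result_text.lower()
    if any(term.lower() not in haystack for term in pattern.must):
        return "fail"
    if any(term.lower() not in haystack for term in pattern.should):
        return "partial"
    return "pass"


def evaluate(patterns: list[Pattern], search: Callable[[str], str]) -> list[tuple[str, str]]:
    """Collect every verdict before reporting.

    An exception raised by `search` propagates and stops the run,
    so an unavailable backend never turns into a recorded pass.
    """
    return [(p.name, grade(p, search(p.name))) for p in patterns]
```

Passing the search function in as a parameter keeps the grading logic testable without a live MCP server; a real run would wrap the actual tool call in that callable.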

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_field | 'metadata' should map string keys to string values | Warning |

Total: 10 / 11 (Passed)
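
The `metadata_field` warning can be reproduced locally with a check along these lines. This is a hedged sketch, not the registry's validator: it assumes the SKILL.md frontmatter has already been parsed into a Python dict, and the function name `metadata_warnings` and the message wording are hypothetical.

```python
from __future__ import annotations


def metadata_warnings(frontmatter: dict) -> list[str]:
    """Return one warning per `metadata` entry that is not a str -> str pair."""
    metadata = frontmatter.get("metadata")
    if metadata is None:
        return []  # no metadata block, nothing to check
    if not isinstance(metadata, dict):
        return ["'metadata' should be a mapping"]

    warnings: list[str] = []
    for key, value in metadata.items():
        if not isinstance(key, str):
            warnings.append(f"metadata key {key!r} is not a string")
        if not isinstance(value, str):
            warnings.append(
                f"metadata[{key!r}] is {type(value).__name__}, expected str"
            )
    return warnings
```

A common trigger for this kind of warning is an unquoted number or boolean in the frontmatter, which a YAML parser loads as a non-string value; quoting it typically makes it a string.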

Repository
equinor/fusion-framework
Reviewed
