Verify claims in backlog items, skill documentation, or plugin content against primary sources using web lookups. Spawns parallel verification agents that MUST use WebFetch/WebSearch/gh — training data recall is explicitly rejected as evidence. Produces VERIFIED/REFUTED/INCONCLUSIVE verdicts with citations. Triggers on "fact check", "verify claims", "check against primary sources", or when backlog items are marked UNVERIFIED.
67
81%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates specific capabilities, explicit trigger conditions, and a distinctive purpose. It uses third person voice throughout, lists concrete actions and outputs, and provides natural trigger terms. The description is comprehensive without being unnecessarily verbose.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: verify claims in backlog items/skill documentation/plugin content, spawns parallel verification agents, uses WebFetch/WebSearch/gh, produces VERIFIED/REFUTED/INCONCLUSIVE verdicts with citations. Very detailed about what it does and how. | 3 / 3 |
| Completeness | Clearly answers both 'what' (verify claims against primary sources using web lookups, produce verdicts with citations) and 'when' (explicit triggers: 'fact check', 'verify claims', 'check against primary sources', or when items are marked UNVERIFIED). | 3 / 3 |
| Trigger Term Quality | Includes natural trigger terms users would say: 'fact check', 'verify claims', 'check against primary sources', and the contextual trigger 'UNVERIFIED'. These cover the most common ways a user would request this functionality. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: fact-checking and claim verification against primary sources is a very specific function unlikely to overlap with other skills. The explicit mention of verification verdicts (VERIFIED/REFUTED/INCONCLUSIVE) and the rejection of training data as evidence further distinguish it. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill with clear workflow sequencing and a strong evidence protocol that explicitly rejects training data recall. Its main weaknesses are that the core verification actions lack concrete, executable examples (how exactly to invoke WebFetch, WebSearch, or spawn agents), and the mermaid diagrams add token cost without proportional value for Claude. The CoVe feedback loop is a notable strength for preventing confirmation bias.
Suggestions
Replace mermaid flowcharts with concise numbered steps or bullet lists — Claude processes text more efficiently than diagram syntax, and this would save significant tokens.
Add concrete, executable examples of the actual tool invocations (e.g., a real WebFetch call, a real WebSearch query, a real `gh api` command) so agents know exactly what to run rather than just seeing method names.
Show a concrete example of spawning a '@fact-checker' agent — is this a Task tool call, a subprocess, or something else? The mechanism is never specified.
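
As an illustration of the kind of concrete invocation the second suggestion asks for, here is a minimal sketch of a `gh api` lookup wrapped in Python. It assumes the GitHub CLI is installed and authenticated. The helper names and the choice of the "latest release" endpoint are hypothetical and are not taken from the skill under review.

```python
import json
import subprocess


def build_gh_release_query(owner: str, repo: str) -> list[str]:
    """Build a `gh api` command for the repo's latest release (hypothetical helper)."""
    return ["gh", "api", f"repos/{owner}/{repo}/releases/latest"]


def fetch_latest_release_tag(owner: str, repo: str) -> str:
    """Run the command and return the release tag.

    Requires network access and an authenticated `gh`; raises
    CalledProcessError if the call fails.
    """
    result = subprocess.run(
        build_gh_release_query(owner, repo),
        capture_output=True,
        text=True,
        check=True,
    )
    # The REST response for a release includes a "tag_name" field.
    return json.loads(result.stdout)["tag_name"]
```

A verification agent could cite the returned tag, together with the endpoint it queried, as primary-source evidence for a version claim.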
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but includes some content that could be tightened: the mermaid diagrams add visual bulk without much value for Claude (which processes text), and some sections like 'When NOT to Use' and the evidence rules preamble ('Adapted from the find-cause evidence chain protocol') are somewhat padded. The valid/invalid evidence lists are valuable but could be more compact. | 2 / 3 |
| Actionability | The skill provides structured templates (verdict format, agent prompt, report format) and specific post-action commands, which is good. However, the core verification process relies on spawning '@fact-checker' agents without explaining how to actually invoke them (no concrete tool calls, no executable code for WebFetch/WebSearch), and the claim extraction process is described abstractly via flowcharts rather than concrete steps or commands. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced: claim extraction → classification → wave spawning → verdict collection → report generation → post-actions. The Chain of Verification (CoVe) requirement adds an explicit validation/feedback loop within each verification agent. The wave execution pattern with sequential batches of 5 is well-defined, and post-actions include linting and committing. | 3 / 3 |
| Progressive Disclosure | The skill references four external skills (find-cause, research-curator, skill-research-process, cove-prompt-design) with relative paths, which is good for navigation. However, no bundle files are provided to verify these references exist, and the skill itself is fairly long (~150 lines of substantive content) with some sections like the full report template that could potentially be split into a reference file. The references section is one-level deep, which is appropriate. | 2 / 3 |
| Total | 9 / 12 Passed | |
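
The wave pattern and verdict format described in the Workflow Clarity row can be sketched roughly as follows. This is a hedged illustration, not the skill's actual implementation: the `ClaimResult` type, the `verify` callback, and the rule that downgrades uncited verdicts to INCONCLUSIVE are all assumptions introduced here, and a thread pool stands in for whatever agent-spawning mechanism the skill really uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterator


class Verdict(Enum):
    VERIFIED = "VERIFIED"
    REFUTED = "REFUTED"
    INCONCLUSIVE = "INCONCLUSIVE"


@dataclass
class ClaimResult:
    claim: str
    verdict: Verdict
    citations: list[str]


def waves(claims: list[str], size: int = 5) -> Iterator[list[str]]:
    """Split claims into sequential batches ("waves") of at most `size`."""
    for start in range(0, len(claims), size):
        yield claims[start:start + size]


def require_citation(result: ClaimResult) -> ClaimResult:
    """Assumed rule: a verdict without any citation is downgraded."""
    if result.verdict is not Verdict.INCONCLUSIVE and not result.citations:
        return ClaimResult(result.claim, Verdict.INCONCLUSIVE, [])
    return result


def run_waves(
    claims: list[str],
    verify: Callable[[str], ClaimResult],
    size: int = 5,
) -> list[ClaimResult]:
    """Run waves one after another; claims within a wave run in parallel."""
    results: list[ClaimResult] = []
    for wave in waves(claims, size):
        with ThreadPoolExecutor(max_workers=size) as pool:
            # pool.map preserves input order, so results line up with claims.
            results.extend(require_citation(r) for r in pool.map(verify, wave))
    return results
```

Each wave finishes before the next one starts, which keeps at most `size` lookups in flight at any time.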
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
4e61312