Use semantic consistency auditor for academic writing workflows that need structured execution, explicit assumptions, and clear output boundaries.
Score: 39

- Quality: 24% — Does it follow best practices?
- Impact: Pending — No eval scenarios have been run
- Passed — No known issues
Optimize this skill with Tessl:

```shell
npx tessl skill review --optimize "./scientific-skills/Academic Writing/semantic-consistency-auditor/SKILL.md"
```

Quality
Discovery — 22%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description fails to communicate what the skill actually does: it uses abstract buzzwords ('structured execution', 'explicit assumptions', 'clear output boundaries') instead of concrete actions. While it attempts to specify a domain (academic writing), the lack of specific capabilities and natural trigger terms makes it difficult for Claude to know when to select this skill over others.
Suggestions

- Replace abstract language with concrete actions (e.g., 'Checks terminology consistency across sections, validates citation references, identifies contradictory claims in academic papers').
- Add natural trigger terms users would say (e.g., 'Use when reviewing academic papers, checking manuscript consistency, or when user mentions thesis, dissertation, research paper, or scholarly writing').
- Clarify what 'semantic consistency auditor' actually does: describe the specific checks or analyses it performs on academic documents.
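A minimal sketch of how these suggestions could land in the skill's frontmatter. The field names follow the common SKILL.md convention, and the wording is illustrative, not the skill's actual description:

```yaml
---
name: semantic-consistency-auditor
description: >
  Checks terminology consistency across sections, validates citation
  references, and identifies contradictory claims in academic papers.
  Use when reviewing academic papers, checking manuscript consistency,
  or when the user mentions a thesis, dissertation, research paper, or
  scholarly writing.
---
```

A description in this shape names concrete actions first and natural trigger terms second, which addresses the Specificity and Trigger Term Quality dimensions directly.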
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'structured execution', 'explicit assumptions', and 'clear output boundaries' without describing any concrete actions the skill performs. No specific capabilities are listed. | 1 / 3 |
| Completeness | The 'what' is extremely vague (no concrete actions described), and while there's a 'Use when' clause, it describes abstract conditions ('structured execution', 'explicit assumptions') rather than actionable triggers. Neither component is adequately addressed. | 1 / 3 |
| Trigger Term Quality | Contains some relevant terms like 'academic writing' and 'semantic consistency auditor', but these are somewhat technical. Missing natural user phrases like 'check my paper', 'review consistency', or 'academic document'. | 2 / 3 |
| Distinctiveness / Conflict Risk | 'Academic writing' provides some niche focus, but 'semantic consistency auditor' is unclear and 'structured execution' could overlap with many workflow-oriented skills. The boundaries are not well-defined. | 2 / 3 |
| Total | | 6 / 12 — Passed |
Implementation — 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill suffers from severe verbosity and poor organization, with circular references, duplicated content, and unnecessary explanations of algorithms Claude already understands. While it provides some actionable code examples, the overall structure makes it difficult to follow and wastes significant token budget on redundant information. The workflow lacks integrated validation checkpoints critical for a tool dealing with model evaluation.
Suggestions

- Remove redundant sections and circular references: consolidate 'When to Use' into a single clear statement, and eliminate 'See ## X above/below' patterns by reorganizing content logically.
- Move the detailed algorithm explanations (BERTScore, COMET theory) to a separate reference file, since Claude already understands these concepts; keep only the configuration and usage specifics.
- Integrate validation checkpoints directly into the workflow steps (e.g., 'Validate model download completed before proceeding to evaluation').
- Fix code block language tags from 'text' to 'bash' or 'python', and ensure installation commands match the declared dependencies in requirements.txt.
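The validation-checkpoint suggestion can be sketched as a small gate pattern: each workflow step runs, then an explicit check must pass before the next step starts. The step names and filenames below are placeholders, not taken from the skill itself:

```shell
set -euo pipefail  # fail fast so later steps never run on a partial setup

run_step() {
  # Run a step, then gate on an explicit validation check before continuing.
  local name="$1" cmd="$2" check="$3"
  echo "step: $name"
  eval "$cmd"
  eval "$check" || { echo "validation failed after: $name" >&2; exit 1; }
}

# Hypothetical pipeline: fetch a model artifact, confirm it is non-empty,
# and only then run the evaluation step.
run_step "download model" "printf 'weights' > model.bin" "test -s model.bin"
run_step "evaluate"       "echo running evaluation"      "true"
```

Embedding the check in the step (rather than in a separate error-handling section) is what turns a numbered list of commands into the validation-gated workflow the review asks for.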
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose with significant redundancy: the 'When to Use' section repeats the same description three times with minor variations, multiple sections reference each other circularly ('See ## Prerequisites above', 'See ## Usage above'), and it includes unnecessary explanations Claude would already know (e.g., explaining in detail what BERTScore and COMET are). | 1 / 3 |
| Actionability | The skill provides some concrete code examples for Python API usage and command-line invocation, but many code blocks are marked as 'text' instead of being executable, the installation instructions use 'pip install bertscore comet-ml', which differs from the dependencies listed in requirements.txt, and the main.py script examples assume a structure that may not match the actual implementation. | 2 / 3 |
| Workflow Clarity | There is a workflow section with numbered steps, but it lacks explicit validation checkpoints between steps. The 'Example run plan' provides a sequence but includes no feedback loops for error recovery. The error-handling section exists but is separate from the workflow rather than integrated as validation gates. | 2 / 3 |
| Progressive Disclosure | The document is a monolithic wall of text with poor organization: sections reference each other circularly ('See ## Prerequisites above' when Prerequisites appears later), content is duplicated across multiple sections (dependencies listed twice, usage examples scattered), and the single reference file mentioned provides minimal value given the document's length. | 1 / 3 |
| Total | | 6 / 12 — Passed |
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure — 10 / 11 Passed
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing them or moving them to metadata | Warning |
| Total | 10 / 11 — Passed | |
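One way the frontmatter warning could be resolved, per the check's own hint. The key name `author` is hypothetical; inspect the actual SKILL.md frontmatter for the offending key:

```yaml
---
name: semantic-consistency-auditor
description: ...
# Before: an unrecognized top-level key (e.g. `author`) triggers the warning.
# After: unknown keys are nested under `metadata`, as the check suggests.
metadata:
  author: example-maintainer
---
```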