semantic-consistency-auditor

Use semantic consistency auditor for academic writing workflows that need structured execution, explicit assumptions, and clear output boundaries.

Quality: 17% (Does it follow best practices?)

Impact: Pending (no eval scenarios have been run)

Security by Snyk: Passed (no known issues)

Optimize this skill with Tessl:

npx tessl skill review --optimize "./scientific-skills/Academic Writing/semantic-consistency-auditor/SKILL.md"
Quality

Discovery: 7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is heavily padded with abstract buzzwords ('structured execution', 'explicit assumptions', 'clear output boundaries') that sound impressive but convey almost no concrete information about what the skill does. It fails to list specific actions, lacks natural trigger terms users would use, and does not clearly explain when Claude should select this skill over others.

Suggestions

- Replace abstract language with concrete actions the skill performs, e.g., 'Checks for logical contradictions, inconsistent terminology, and conflicting claims across sections of academic papers.'

- Add a 'Use when...' clause with natural trigger terms like 'check consistency', 'review paper', 'academic paper', 'terminology conflicts', 'logical contradictions' (combined in the sketch after this list).

- Remove buzzwords like 'structured execution' and 'clear output boundaries' that don't help Claude distinguish when to use this skill.
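
A hedged sketch combining the first two suggestions into a single description; the exact wording is illustrative, not taken from the skill:

description: Checks academic papers for logical contradictions, inconsistent terminology, and conflicting claims across sections. Use when the user asks to 'check consistency' or 'review a paper', or mentions 'terminology conflicts' or 'logical contradictions'.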

Dimension scores

Specificity: 1/3
The description uses vague, abstract language like 'structured execution', 'explicit assumptions', and 'clear output boundaries' without listing any concrete actions the skill performs. There are no specific capabilities mentioned, only buzzwords.

Completeness: 1/3
The 'what' is extremely vague: 'semantic consistency auditor' doesn't explain what the skill actually does. The 'when' is present ('academic writing workflows that need...') but is so abstract it provides no actionable trigger guidance. Both components are very weak.

Trigger Term Quality: 1/3
The terms 'semantic consistency auditor', 'structured execution', 'explicit assumptions', and 'clear output boundaries' are not natural phrases a user would say. Only 'academic writing' is somewhat natural; the rest is jargon that users would not typically use in requests.

Distinctiveness / Conflict Risk: 2/3
The term 'semantic consistency auditor' is somewhat distinctive and unlikely to conflict with many other skills, but the vague framing around 'academic writing workflows' could overlap with other academic writing or editing skills without clear differentiation.

Total: 5 / 12 (Passed)

Implementation: 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill suffers from severe verbosity, poor organization, and redundancy. It contains useful domain-specific content (BERTScore/COMET configuration, CLI/API examples, input/output JSON schemas) but buries it under layers of generic boilerplate, repeated sections, and explanations of concepts Claude already knows. The cross-references are broken (pointing 'above' to sections that appear below), and the content would benefit enormously from splitting into a concise overview with references to detailed sub-documents.

Suggestions

- Cut the content by at least 50%: remove the algorithm explanations (Claude knows BERTScore/COMET), generic boilerplate sections (Evaluation Criteria checkboxes, Output Requirements, Input Validation), and duplicate command blocks.

- Fix the document structure: the 'See ## X above' references point to sections that appear later in the document. Reorganize so the flow is logical (Prerequisites → Installation → Quick Start → Usage → Configuration → Output Format).

- Move detailed content (input/output JSON schemas, configuration YAML, Python API examples) into separate reference files and link to them from a concise overview, rather than inlining 300+ lines.

- Add concrete validation steps after running the evaluation, e.g., 'Verify output JSON contains expected keys', 'Check that scores are within [0,1] range', 'If COMET model download fails, use --bert-only flag' (see the sketch after this list).
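
A minimal sketch of that last validation step, assuming the auditor writes a results JSON file; the file name, the top-level 'pairs' key, and the score field names are hypothetical, not taken from the skill:

import json

# Hypothetical output path and schema; adjust to the skill's actual format.
with open("audit_results.json", encoding="utf-8") as f:
    results = json.load(f)

expected_keys = {"bertscore_f1", "comet_score"}  # assumed score fields
for entry in results.get("pairs", []):           # assumed result structure
    missing = expected_keys - set(entry)
    assert not missing, f"missing keys: {missing}"
    for key in expected_keys:
        assert 0.0 <= entry[key] <= 1.0, f"{key}={entry[key]} outside [0, 1]"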

Dimension scores

Conciseness: 1/3
The skill is extremely verbose and repetitive. It contains multiple redundant sections (e.g., 'See ## Prerequisites above', 'See ## Usage above', 'See ## Workflow above' cross-references to sections that appear later), duplicate command blocks (py_compile appears 3 times), explains concepts Claude already knows (what BERTScore and COMET are, what precision/recall/F1 mean), and includes boilerplate sections like 'Evaluation Criteria' with generic checkboxes that add no value. The content could easily be cut by 60%+.

Actionability: 2/3
The skill provides concrete CLI commands and Python API examples with specific model names and parameters, which is good. However, many code blocks use 'text' language markers instead of proper syntax highlighting, version numbers for dependencies are all 'unspecified', the install command uses 'pip install bertscore comet-ml', which may not match the actual package names, and the Python API import path ('from semantic_consistency_auditor import ...') is not clearly tied to the repository structure. The examples are plausible but not verified as executable.

Workflow Clarity: 2/3
There are multiple workflow sections that partially overlap ('Example Usage' run plan, 'Workflow' section, 'Response Template'). The main 'Workflow' section provides a reasonable 5-step process with fallback handling, but validation checkpoints are vague ('Validate that the request matches the documented scope') rather than concrete. The 'Quick Check' section provides a concrete validation command (py_compile; see the sketch after this table), but there is no explicit feedback loop for fixing issues with the actual semantic evaluation output.

Progressive Disclosure: 1/3
The skill is a monolithic wall of text with 300+ lines all inline. It mentions 'references/audit-reference.md' but dumps everything (algorithm explanations, configuration, input/output formats, performance notes, academic references, changelog, response templates) into a single file. The content is poorly organized, with sections that reference other sections that appear later ('See ## Prerequisites above' when Prerequisites is below), creating a confusing reading order.

Total: 6 / 12 (Passed)
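
For reference, the py_compile quick check mentioned above can be run from Python's standard library; the script path here is a placeholder:

import py_compile

# Byte-compile the skill's script to surface syntax errors without running it;
# doraise=True raises py_compile.PyCompileError on failure instead of printing.
py_compile.compile("scripts/semantic_audit.py", doraise=True)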

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation checks: 10 / 11 passed

Validation for skill structure

frontmatter_unknown_keys: Warning (see the sketch below)
Unknown frontmatter key(s) found; consider removing or moving to metadata.

Total: 10 / 11 (Passed)
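
A minimal sketch of the kind of check behind this warning, assuming YAML frontmatter delimited by '---'; the allowed-key set is hypothetical, since the actual spec schema is not shown here:

import yaml  # assumes PyYAML is installed

ALLOWED_KEYS = {"name", "description", "metadata"}  # hypothetical schema

with open("SKILL.md", encoding="utf-8") as f:
    frontmatter = yaml.safe_load(f.read().split("---")[1])

unknown = sorted(set(frontmatter) - ALLOWED_KEYS)
if unknown:
    print(f"Unknown frontmatter key(s) {unknown}; consider moving them under 'metadata:'")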

Repository: aipoch/medical-research-skills (Reviewed)

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.