
he-spec

Create bounded, evidence-backed Harness Engineering specs from approved intent. Use when a selected issue, milestone, reframe phase, or execution slice needs acceptance criteria, traceability, risk gates, and validation boundaries before planning or implementation.

53

Quality

60%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./Plugins/harness-engineering/skills/he-spec/SKILL.md

Quality

Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured description that clearly defines both what the skill does and when to use it, with an explicit 'Use when' clause and specific deliverables. Its main weakness is the heavy use of domain-specific jargon that may not match natural user language, which could reduce discoverability if users phrase requests in more common terms. The description excels at distinctiveness and completeness but could benefit from including more natural trigger terms.

Suggestions

Add common natural-language synonyms alongside the domain jargon, e.g., 'technical specification', 'requirements', 'define scope', 'write a spec' to improve discoverability.

Consider briefly clarifying what 'Harness Engineering' refers to for disambiguation, or add alternative phrasings users might use when requesting this type of work.
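The first suggestion can be checked mechanically. The sketch below is a minimal illustration, not part of Tessl's tooling: it tests which of the suggested natural-language terms already appear in the skill's published description. The helper name `missing_terms` is hypothetical, and plain substring matching is an assumption; the reviewer's actual scoring method is not documented on this page.

```python
# Skill description as published on this page.
DESCRIPTION = (
    "Create bounded, evidence-backed Harness Engineering specs from approved intent. "
    "Use when a selected issue, milestone, reframe phase, or execution slice needs "
    "acceptance criteria, traceability, risk gates, and validation boundaries "
    "before planning or implementation."
)

# Natural-language trigger terms proposed in the suggestion above.
SUGGESTED_TERMS = [
    "technical specification",
    "requirements",
    "define scope",
    "write a spec",
]


def missing_terms(description, terms):
    """Return the terms that do not occur in the description (case-insensitive)."""
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


if __name__ == "__main__":
    for term in missing_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(f"missing trigger term: {term}")
```

Run against the current description, all four suggested terms are reported as missing, which is consistent with the 2 / 3 score for Trigger Term Quality.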

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: creating specs, acceptance criteria, traceability, risk gates, and validation boundaries. The description names a clear domain (Harness Engineering specs) and enumerates distinct deliverables. | 3 / 3 |
| Completeness | Clearly answers both 'what' (create bounded, evidence-backed Harness Engineering specs with acceptance criteria, traceability, risk gates, and validation boundaries) and 'when' (when a selected issue, milestone, reframe phase, or execution slice needs these before planning or implementation) with an explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant domain-specific terms like 'acceptance criteria', 'risk gates', 'validation boundaries', 'specs', and 'execution slice', but these are fairly specialized jargon. Common user phrases like 'write a spec', 'define requirements', or 'technical specification' are missing, and terms like 'reframe phase' and 'Harness Engineering' are very niche and unlikely to be naturally said by most users. | 2 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to the specific 'Harness Engineering specs' domain and the particular combination of triggers (approved intent, reframe phase, execution slice). This is unlikely to conflict with other skills given its narrow, well-defined niche. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Implementation

35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is a complex, domain-specific process document that suffers from excessive verbosity and internal jargon, making it difficult to parse even for an advanced agent. While it demonstrates thoughtful workflow design with validation gates and conditional reference loading, the lack of concrete output examples, template snippets, or executable demonstrations significantly reduces actionability. The content reads more like an internal specification for a specification system than a clear, lean skill instruction.

Suggestions

Add a concrete, minimal example of a complete spec output (even abbreviated) showing the status block, frontmatter fields, BLUF paragraph, and acceptance IDs so Claude knows exactly what to produce.

Cut explanatory prose that describes the system's philosophy and meta-process (e.g., 'context-disposition policy', 'agent-native compression') and replace with terse, imperative instructions—aim to reduce total length by 40%.

Consolidate the 15+ conditional reference entries into a simple table with columns: Reference | When to Load | Path, reducing the verbose 'Read when' paragraph format.

Show a concrete example of the compact status block output format with actual field values rather than just listing field names.
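The table suggested above (Reference | When to Load | Path) could be generated from a small list of entries rather than maintained by hand. The following is a rough sketch under stated assumptions: `Reference` and `render_table` are hypothetical names, the two file names are the ones this review mentions, and the "when to load" conditions are placeholders because the skill's actual conditions are not reproduced on this page.

```python
from typing import List, NamedTuple


class Reference(NamedTuple):
    name: str
    when_to_load: str
    path: str


def render_table(refs: List[Reference]) -> str:
    """Render reference entries as a three-column pipe table."""
    header = "| Reference | When to Load | Path |"
    divider = "| --- | --- | --- |"
    rows = [f"| {r.name} | {r.when_to_load} | {r.path} |" for r in refs]
    return "\n".join([header, divider, *rows])


# Placeholder entries: the conditions must be copied from SKILL.md, and the
# paths are bare file names because their directory is not shown in the review.
REFS = [
    Reference("Spec mode rules", "<condition from SKILL.md>", "spec-mode-rules.md"),
    Reference("Spec artifact contract", "<condition from SKILL.md>", "spec-artifact-contract.md"),
]

if __name__ == "__main__":
    print(render_table(REFS))
```

Keeping the entries as data makes it easy to spot references that lack a clear load condition, which is the cognitive-overload problem the review raises.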

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is extremely verbose with dense jargon, internal system terminology, and process-heavy language that reads more like an internal design document than actionable instructions. Many sections explain meta-process (e.g., 'context-disposition policy', 'agent-native compression', 'pragmatic invariants') that add token cost without clear instructional value. Claude doesn't need lengthy explanations of when to read each of 15+ reference files. | 1 / 3 |
| Actionability | The procedure section provides a numbered sequence and references specific scripts (check_bluf_structure.py, check_generated_artifact_shape.py) with concrete command-line invocations. However, most guidance is abstract and process-descriptive rather than executable: there are no concrete examples of spec output format, no template snippets, and the output schema fields are listed but never shown in a concrete example structure. | 2 / 3 |
| Workflow Clarity | The 8-step procedure provides a clear sequence with some validation checkpoints (steps 4, 6, 7, 8 and the Validation section with specific scripts). However, the steps are dense and mix multiple concerns per step, validation gates are described abstractly ('pass/fail/blocked' without showing what that looks like), and the feedback loop for error recovery is implicit rather than explicit. The 'deepen spec' workflow is mentioned but not clearly sequenced. | 2 / 3 |
| Progressive Disclosure | The References section provides extensive one-level-deep references with clear 'Read when' conditions, which is good progressive disclosure design. However, the main body itself is a wall of dense text that could benefit from better structural organization, and without bundle files to verify, many referenced paths (spec-mode-rules.md, spec-artifact-contract.md, etc.) cannot be validated. The sheer volume of conditional references (15+) creates cognitive overload rather than clear navigation. | 2 / 3 |
| Total | | 7 / 12 (Passed) |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 (Passed) |
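The `metadata_version` warning can be reproduced locally with a small check. This is a sketch only, and it assumes that 'metadata.version' refers to a `version` key nested under a top-level `metadata` key in the YAML frontmatter of SKILL.md; the validator's real rule is not shown on this page. The function name `has_metadata_version` and the sample version value are hypothetical, and a real check would be better served by a proper YAML parser.

```python
def has_metadata_version(skill_md: str) -> bool:
    """Return True if the frontmatter has a `version` key nested under `metadata`."""
    lines = skill_md.splitlines()
    # Frontmatter must open with a `---` line at the very top of the file.
    if not lines or lines[0].strip() != "---":
        return False
    in_metadata = False
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter: stop before the body
            break
        if not line.strip():
            continue
        if line[0] not in " \t":
            # A top-level key: track whether we just entered the metadata block.
            in_metadata = line.split("#")[0].strip() == "metadata:"
        elif in_metadata and line.strip().startswith("version:"):
            return True
    return False


# Sample frontmatter with a placeholder version value.
SAMPLE = """---
name: he-spec
metadata:
  version: 0.1.0
---
Skill body goes here.
"""

if __name__ == "__main__":
    print(has_metadata_version(SAMPLE))
```

Under the same assumption, adding a `metadata` block with a `version` entry to the frontmatter would be the natural way to clear the warning.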

Repository
jscraik/Agent-Skills
Reviewed
