he-eval-report

Generate closure-grade HE eval and drift proof for one execution slice. Use when Linear, milestone, or source-prompt closure needs validation evidence.

Quality

85%

Does it follow best practices?

Run evals on this skill

Adds up to 20 points to the overall score

View guide

Securityby

Passed

No findings from the security scan

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The content is highly actionable with concrete commands, a clear sequenced procedure, and strong validation feedback loops, supported by a well-organized one-level reference structure. Its only notable weakness is conciseness, with jargon padding and repeated caveats that could be consolidated.

Suggestions

Consolidate the repeated "missing validation is not a pass" / redaction / not-proof caveats that recur across Validation, Evidence Requirements, and Gotchas into a single Safety/Proof section.

Trim jargon-packed sentences (e.g. "first-principles, XP, gate-selection, plugin-hook capability, domain-model... checks") to the checks actually relevant, or defer the enumeration to a referenced contract.

Reduce the Inputs/Outputs field lists by moving the full Artifact Identity field set to the referenced template rather than enumerating every field inline.

Dimension	Reasoning	Score
Conciseness	The body is dense (~210 lines) with jargon-heavy phrasing and some redundancy (e.g. "Missing validation is not a pass" restated across Validation, Evidence Requirements, and Gotchas); it is mostly efficient but could be tightened and assumes more domain fluency than it earns.	2 / 3
Actionability	Validation lists exact executable commands with full repo-root paths, Outputs specifies a concrete report path pattern, and Procedure steps name specific contract files and required fields, giving copy-paste-ready guidance.	3 / 3
Workflow Clarity	An 11-step numbered Procedure has explicit validation checkpoints ("never invent passing results", "Fail fast: stop at the first failed gate") with pass\|fail\|blocked outcomes and retry-before-proceed feedback loops appropriate for destructive/batch eval work.	3 / 3
Progressive Disclosure	The References section groups contracts by purpose ("Read when writing reports", "Read when classifying drift") with one-level-deep, clearly signaled links; local references (contract.yaml, evals.yaml, source-prompt-preservation.md) are verified real bundle files, and the body stays an overview pointing to detail.	3 / 3
	Total	11 / 12 Passed

Description

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is concise, third-person, and clearly pairs a concrete capability with an explicit use trigger. Its main weakness is trigger-term coverage leaning on internal jargon rather than the full range of natural phrasings a user might say.

Suggestions

Broaden trigger terms with natural phrasings a user would actually say, e.g. "closure evidence", "is this safe to close", "completion proof", alongside the current Linear/milestone/source-prompt terms.

Soften or gloss the jargon "HE eval" and "execution slice" so a non-specialist user's request can still match the skill.

Dimension	Reasoning	Score
Specificity	"Generate closure-grade HE eval and drift proof" lists multiple concrete actions (generate eval report, generate drift proof) scoped to one execution slice, matching the multiple-specific-actions anchor.	3 / 3
Completeness	It states what it does ("Generate closure-grade HE eval and drift proof") and gives an explicit "Use when..." trigger for when Claude should invoke it.	3 / 3
Trigger Term Quality	"Linear, milestone, or source-prompt closure" surfaces natural user terms, but "HE eval" and "execution slice" are domain jargon with missing common variations, so coverage is partial.	2 / 3
Distinctiveness Conflict Risk	The niche is narrow (closure proof for one HE slice) with distinct triggers tied to Linear/milestone/source-prompt closure, making misfires for other skills unlikely.	3 / 3
	Total	11 / 12 Passed

Validation

93%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
metadata_version	'metadata.version' is missing	Warning

	Total	15 / 16 Passed

Repository: jscraik/Agent-Skills
Path: Plugins/harness-engineering/skills/he-eval-report/SKILL.md
Commit: 4f7075e

Reviewed: 1 day ago

Table of Contents

Discovery Implementation Validation

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.