Generate closure-grade HE eval and drift proof for one execution slice. Use when Linear, milestone, or source-prompt closure needs validation evidence.
Implementation is not completion. This skill writes closure proof for exactly
one approved Harness Engineering slice, with evidence for validation, drift,
side effects, traceability, generated media when relevant, and Linear closure
safety. Higher-priority instructions, command boundaries, and local AGENTS.md
guidance remain binding.
Inputs: selected slice, source .harness/{linear,reframes,decisions,core,strategy,triage,brainstorm,spec,plan,solutions}/
artifacts, implementation diff, validation output, branch/PR evidence, Linear
identifiers, proof artifacts, and generated-media cache paths or repository media
paths when media proof is part of the slice.
Write one report at .harness/evals/YYYY-MM-DD-JSC-###-<repo>-<issue-or-milestone>-eval.md
when Linear context is known, or .harness/evals/YYYY-MM-DD-<repo>-<issue-or-milestone>-eval.md
otherwise. Include Artifact Identity frontmatter from
Plugins/harness-engineering/references/artifact-routing-contract.md and return
schema_version, evaluated slice, validation results, drift validation, proof
artifacts, closure recommendation, follow-up work, blockers, git staging
status, staged paths, source_prompt_family_status when source-prompt
closure is in scope, Codex provenance status, PR safety trace status, next
handoff, and confidence.
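The two path patterns above can be sketched as a small helper. This is a minimal illustration, not part of the skill: the function name `eval_report_path` and its parameters are hypothetical, and it assumes the `JSC-###` segment is a Linear identifier passed in as-is.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def eval_report_path(
    repo: str,
    issue_or_milestone: str,
    linear_id: Optional[str] = None,  # assumption: e.g. "JSC-123" when Linear context is known
    on: Optional[date] = None,
) -> Path:
    """Build the report path; the Linear segment is included only when known."""
    day = (on or date.today()).isoformat()  # YYYY-MM-DD
    parts = [day]
    if linear_id:
        parts.append(linear_id)
    parts += [repo, issue_or_milestone, "eval"]
    return Path(".harness/evals") / ("-".join(parts) + ".md")
```

Calling it without `linear_id` yields the shorter pattern, so one code path covers both cases.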
Non-trivial reports also include the BLUF review surface so the closure
recommendation, blocker consequence, and next action are visible before proof
detail.
Complete or Complete with follow-up as a Linear closure
recommendation.
Run these from the repo root and record exact pass|fail|blocked outcomes.
Use each command with the report path argument:
    python3 Plugins/harness-engineering/skills/he-eval-report/scripts/validate_eval_report.py
    python3 Infrastructure/scripts/validation-and-linting/he_artifact_identity_lint.py
    python3 Infrastructure/scripts/validation-and-linting/he_frontmatter_safety_lint.py
For skill-package edits, also run strict skill audit, OpenClaw, OpenAI format,
progressive-disclosure lint, Plugin Eval, focused script tests, and smoke or
release eval listing/execution when available. Missing proof is not-run or
blocked, never pass. Fail fast: stop at the first failed gate, fix or
classify it, then rerun before proceeding to broader gates.
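The fail-fast rule can be sketched as a small runner. This is an illustration under stated assumptions, not the skill's own tooling: `run_gates` and `run_script` are hypothetical names, exit code 0 is assumed to mean pass, a missing script or interpreter is assumed to count as blocked, and a blocked gate is assumed to stop the run just as a failed one does.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

# The three validator scripts named in this skill.
VALIDATORS = (
    "Plugins/harness-engineering/skills/he-eval-report/scripts/validate_eval_report.py",
    "Infrastructure/scripts/validation-and-linting/he_artifact_identity_lint.py",
    "Infrastructure/scripts/validation-and-linting/he_frontmatter_safety_lint.py",
)


def run_script(argv: Sequence[str]) -> int:
    """Run one validator from the repo root and return its exit code."""
    if not Path(argv[1]).is_file():
        raise FileNotFoundError(argv[1])  # reported as "blocked", never "pass"
    return subprocess.run(list(argv), check=False).returncode


def run_gates(
    report_path: str,
    runner: Callable[[Sequence[str]], int] = run_script,
    scripts: Sequence[str] = VALIDATORS,
) -> List[Tuple[str, str]]:
    """Return (script, outcome) pairs; gates after the first non-pass are not-run."""
    outcomes: List[Tuple[str, str]] = []
    stopped = False
    for script in scripts:
        if stopped:
            outcomes.append((script, "not-run"))
            continue
        try:
            code = runner(["python3", script, report_path])
        except OSError:
            status = "blocked"
        else:
            status = "pass" if code == 0 else "fail"  # assumption: 0 means pass
        outcomes.append((script, status))
        stopped = status != "pass"
    return outcomes
```

Gates that were never reached are recorded as not-run rather than omitted, so the report cannot imply a pass for proof that does not exist.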
.harness/media/ PNG exists and
a sidecar records purpose, source cache path, repository path, prompt metadata,
linked context, and validation notes.
If identifiers, source artifacts, validation evidence, report validation, media
files, Codex provenance required for a claim, or the evaluated slice cannot be
resolved, write the gap into the report, classify closure safety as Blocked,
Needs rework, or Unsafe to close, and state the smallest repair before
completion.
he-linear-plan after
explicit approval.
Keep reports scannable in plain Markdown. Avoid color-only status, giant tables without surrounding prose, image-only proof, or conclusions that require reading unlinked logs.
Use the template in ../../references/skills/he-eval-report/eval-report-template.md plus the BLUF review
surface for non-trivial reports. Closure recommendation must be one of
Complete, Complete with follow-up, Blocked, Needs rework, or
Unsafe to close; do not use completion recommendations until steering is
complete.
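A report generator could enforce the allowed values with a check like the following. This is a minimal sketch: the function name and the `steering_complete` flag are hypothetical, and how steering completion is actually determined is outside what this section states.

```python
CLOSURE_RECOMMENDATIONS = (
    "Complete",
    "Complete with follow-up",
    "Blocked",
    "Needs rework",
    "Unsafe to close",
)
COMPLETION_RECOMMENDATIONS = frozenset({"Complete", "Complete with follow-up"})


def check_closure_recommendation(value: str, steering_complete: bool) -> str:
    """Return value unchanged if it is allowed; raise ValueError otherwise."""
    if value not in CLOSURE_RECOMMENDATIONS:
        raise ValueError(f"unknown closure recommendation: {value!r}")
    if value in COMPLETION_RECOMMENDATIONS and not steering_complete:
        # Completion recommendations are not allowed until steering is complete.
        raise ValueError(f"{value!r} requires steering to be complete")
    return value
```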
Tie confidence to direct evidence, validator results, runtime proof, media persistence proof where relevant, and remaining unknowns. Cap confidence when strict audit, smoke/release evals, Plugin Eval, runtime visibility, Linear proof, or media persistence is failed or blocked.
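The capping rule could look like the sketch below. The three-level scale, the default cap of "medium", and the check key names are all assumptions made for illustration; this section does not define a confidence scale or a cap value.

```python
from typing import Mapping

LEVELS = ("low", "medium", "high")  # hypothetical scale, lowest first

# Hypothetical keys for the checks this skill names as confidence-capping.
CAPPING_CHECKS = (
    "strict_audit",
    "smoke_release_evals",
    "plugin_eval",
    "runtime_visibility",
    "linear_proof",
    "media_persistence",
)


def capped_confidence(
    proposed: str,
    check_status: Mapping[str, str],
    cap: str = "medium",  # assumption: the cap level is not given in the source
) -> str:
    """Lower proposed confidence to cap if any capping check failed or is blocked."""
    must_cap = any(
        check_status.get(name) in {"fail", "blocked"} for name in CAPPING_CHECKS
    )
    if not must_cap:
        return proposed
    return LEVELS[min(LEVELS.index(proposed), LEVELS.index(cap))]
```

Taking the minimum of the two levels means the cap can only lower confidence and never raises a value that was already below it.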
.harness/media/.
../../references/skills/he-eval-report/eval-report-contract.md, ../../references/skills/he-eval-report/eval-report-template.md, ../../references/skills/he-eval-report/eval-report-schema.json.
../../references/skills/he-eval-report/drift-taxonomy.md, ../../references/skills/he-eval-report/linear-completion-policy.md.
references/contract.yaml, references/evals.yaml.
references/source-prompt-preservation.md.
../../references/bluf-review-contract.md.
../../references/visual-reference-contract.md.
../../references/codex-provenance-contract.md, ../../references/pr-safety-trace-contract.md.
../../references/closure-mutation-contract.md.
../../references/domain-context-contract.md, ../../references/domain-model-production-contract.md.
../../references/subagent-call-contract.md.
Plugins/harness-engineering/references/deferred-context-index.md.
Apply the context-disposition policy: move important still-valid context to references, and intentionally discard stale, duplicated, unsafe, superseded, or low-signal text.