he-improve

Improve existing Harness Engineering implementations or workflows with evidence-backed changes. Use when users ask for targeted enhancement of shipped or drafted work.

Quality

26%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./Plugins/harness-engineering/fixtures/budget-archive/2026-04-21/deferred-store/skills/team_automation/he-improve/SKILL.md

Quality

Discovery

17%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is too vague and abstract to be useful for skill selection. It fails to explain what 'Harness Engineering' is, what concrete actions the skill performs, or what natural user requests should trigger it. The buzzword-heavy language ('evidence-backed changes', 'targeted enhancement') adds no discriminative value.

Suggestions

Define what 'Harness Engineering' means and list specific concrete actions the skill performs (e.g., 'optimize CI/CD pipelines, refactor deployment configurations, improve test harness coverage').

Add natural trigger terms users would actually say, such as specific technologies, file types, or task names (e.g., 'Harness CI/CD, pipeline optimization, .harness config files, deployment workflows').

Make the 'Use when...' clause more explicit with concrete scenarios, e.g., 'Use when users want to optimize Harness pipeline YAML, reduce build times, or refactor deployment stages.'

Dimension	Reasoning	Score
Specificity	The description uses vague language like 'improve existing implementations or workflows' and 'evidence-backed changes' without listing any concrete actions. It doesn't specify what kinds of improvements, what 'Harness Engineering' entails, or what specific operations are performed.	1 / 3
Completeness	It has a weak 'what' (improve implementations/workflows) and does include a 'Use when...' clause ('when users ask for targeted enhancement of shipped or drafted work'), but both are vague and lack explicit, actionable triggers.	2 / 3
Trigger Term Quality	The terms 'Harness Engineering implementations', 'evidence-backed changes', and 'shipped or drafted work' are not natural phrases users would say. There are no common trigger words a user would naturally use when needing this skill.	1 / 3
Distinctiveness Conflict Risk	The description is very generic — 'improve existing implementations or workflows' could apply to virtually any code improvement, refactoring, or optimization skill. Without specifics about what 'Harness Engineering' means or what domain it covers, this would easily conflict with other improvement/enhancement skills.	1 / 3
	Total	5 / 12 Passed
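As a quick check on the arithmetic, the total row can be reproduced by summing the per-dimension scores. This is only a sketch: the names `discovery_scores` and `tally` are illustrative, and the page does not state how the 17% headline figure is derived from the 5 / 12 total, so that step is not modeled here.

```python
# Scores copied from the Discovery table above; each dimension is rated out of 3.
discovery_scores = {
    "Specificity": 1,
    "Completeness": 2,
    "Trigger Term Quality": 1,
    "Distinctiveness Conflict Risk": 1,
}

MAX_PER_DIMENSION = 3


def tally(scores: dict[str, int]) -> tuple[int, int]:
    """Return (points earned, points available) for a rubric table."""
    return sum(scores.values()), MAX_PER_DIMENSION * len(scores)


earned, available = tally(discovery_scores)
print(f"{earned} / {available}")  # prints "5 / 12"
```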

Implementation

35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads as a high-level process framework rather than actionable guidance. It establishes good philosophical principles (evidence-based optimization, bounded experiments, fail-fast gates) and identifies useful anti-patterns, but fails to provide any concrete, executable steps — no commands, no code, no specific tool invocations. The domain-specific terminology ('Harness Engineering', 'context-disposition policy', 'measurement harness') is used without definition, making the skill opaque to anyone not already deeply familiar with the system.

Suggestions

Add concrete, executable examples for key procedure steps — e.g., show the actual commands to create an optimization spec, run a measurement baseline, and compare experiment results.

Replace the natural language 'Examples' section with at least one complete input/output walkthrough showing the spec format, measurement output, and decision outcome.

Define or link to definitions of domain-specific terms like 'Harness Engineering', 'measurement harness', 'context-disposition policy', and 'session-collector' so the skill is self-contained.

Fix the deeply nested relative path (../../../../../../references/session-evidence-contract.md) — either include the referenced file in the bundle or use a more stable path convention.
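To see why the nested path is fragile, the reference can be resolved lexically against the skill's location. The sketch below makes two assumptions: that the reference is resolved relative to the SKILL.md path shown in the optimize command, and that the bundle root is the skill's own directory. The helper names `resolve_reference` and `stays_in_bundle` are hypothetical.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_reference(skill_file: str, reference: str) -> PurePosixPath:
    """Lexically resolve a reference relative to the skill file's directory."""
    base = PurePosixPath(skill_file).parent
    return PurePosixPath(posixpath.normpath(str(base / reference)))


def stays_in_bundle(target: PurePosixPath, bundle_root: str) -> bool:
    """True if target is the bundle root or lies beneath it."""
    root = PurePosixPath(posixpath.normpath(bundle_root))
    return target == root or root in target.parents


# Paths taken from this page; treating them as belonging together is an assumption.
SKILL_FILE = (
    "./Plugins/harness-engineering/fixtures/budget-archive/2026-04-21/"
    "deferred-store/skills/team_automation/he-improve/SKILL.md"
)
REFERENCE = "../../../../../../references/session-evidence-contract.md"

target = resolve_reference(SKILL_FILE, REFERENCE)
print(target)
# Plugins/harness-engineering/fixtures/references/session-evidence-contract.md

# Assumed bundle root: the directory that contains SKILL.md.
bundle_root = str(PurePosixPath(SKILL_FILE).parent)
print(stays_in_bundle(target, bundle_root))  # False
```

Under these assumptions the six `..` segments climb out of the skill directory entirely, which is why the reference cannot be verified from the bundle alone.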

Dimension	Reasoning	Score
Conciseness	The content is moderately efficient but includes some unnecessary padding. Phrases like 'Progressive Disclosure Entry' and 'This entrypoint stays concise and keeps full operational context in archived references' are meta-commentary. The procedure steps are reasonably tight but use domain jargon ('Harness Engineering', 'session-collector', 'context-disposition policy') without concrete definitions, creating verbosity without clarity. Some sections like 'Full Context' with just icon references add little value.	2 / 3
Actionability	The skill is almost entirely abstract and descriptive. There are no concrete commands, executable code, specific file paths to edit, or copy-paste ready examples. The procedure steps describe what to do conceptually ('Load or create the optimization spec', 'Run bounded iterations') but never show how. The 'Examples' section contains natural language prompts rather than input/output demonstrations with concrete artifacts.	1 / 3
Workflow Clarity	The procedure provides a numbered sequence with validation checkpoints and the validation section includes explicit gates ('Fail fast: stop at first failed gate'). However, the steps are abstract rather than concrete — there are no specific commands, tools, or file operations shown. The feedback loop concept (keep/revise/discard based on measured outcomes) is present but implicit rather than demonstrated with concrete validation commands.	2 / 3
Progressive Disclosure	The skill references an external file (session-evidence-contract.md) with a relative path, suggesting some progressive disclosure structure. However, no bundle files were provided to verify the reference exists, the path uses deeply nested relative navigation (../../../../../../), and the 'Full Context' section only links to icon assets rather than substantive reference materials. The skill claims to keep 'full operational context in archived references' but doesn't clearly signal where those references are.	2 / 3
	Total	7 / 12 Passed
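The review credits the skill with fail-fast gates and a keep/revise/discard loop but notes that neither is demonstrated. A minimal sketch of what such a demonstration might look like follows. Everything concrete in it is an assumption: the gate names, the 2% tolerance, and the "lower is better" metric are hypothetical and are not taken from the skill itself.

```python
from typing import Callable, NamedTuple


class GateResult(NamedTuple):
    name: str
    passed: bool


def run_gates(gates: list[tuple[str, Callable[[], bool]]]) -> list[GateResult]:
    """Run gates in order and stop at the first failure (fail fast)."""
    results = []
    for name, check in gates:
        passed = bool(check())
        results.append(GateResult(name, passed))
        if not passed:
            break  # later gates are never evaluated
    return results


def decide(baseline: float, candidate: float, tolerance: float = 0.02) -> str:
    """Map a measured change to keep / revise / discard (lower is better)."""
    if candidate <= baseline * (1 - tolerance):
        return "keep"  # clearly better than the baseline
    if candidate <= baseline * (1 + tolerance):
        return "revise"  # within noise of the baseline
    return "discard"  # clearly worse than the baseline


# Hypothetical gates; a real skill would replace the lambdas with actual checks.
results = run_gates([
    ("spec_exists", lambda: True),
    ("baseline_recorded", lambda: False),
    ("experiment_bounded", lambda: True),
])
print([r.name for r in results])  # ['spec_exists', 'baseline_recorded']
print(decide(baseline=100.0, candidate=90.0))  # keep
```

The third gate never runs because the second one fails, which is the behavior the "stop at first failed gate" rule describes.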

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
metadata_version	'metadata.version' is missing	Warning

	Total	10 / 11 Passed
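The single warning concerns a missing `metadata.version` key. The skill's front matter is not shown on this page, so the sketch below uses a hypothetical, already-parsed dictionary to illustrate how a dotted key of that kind could be checked. It is not a description of how the validator itself works, and the version value is a placeholder.

```python
from typing import Any


def has_dotted_key(data: dict[str, Any], dotted_key: str) -> bool:
    """Return True if a dotted key such as 'metadata.version' exists in nested dicts."""
    current: Any = data
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


# Hypothetical front matter, already parsed into a dict.
front_matter = {"name": "he-improve", "metadata": {"author": "example"}}
print(has_dotted_key(front_matter, "metadata.version"))  # False

front_matter["metadata"]["version"] = "0.1.0"  # placeholder value
print(has_dotted_key(front_matter, "metadata.version"))  # True
```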

Repository: jscraik/Agent-Skills
Commit: 4c78f98

Reviewed: about 22 hours ago

