
he-tdd

Build behavior-safe code changes with TDD and RED/GREEN evidence. Use when he-plan or he-work requires TDD for a concrete behavior target.

56

Quality

66%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./Plugins/harness-engineering/fixtures/budget-archive/2026-04-21/deferred-store/skills/team_automation/he-tdd/SKILL.md

Quality

Discovery

75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description establishes a clear niche around TDD-based code changes with explicit trigger conditions, making it reasonably complete and distinctive. However, it relies on internal/custom terminology ('he-plan', 'he-work') that limits natural trigger term coverage, and the specific capabilities could be more concretely enumerated (e.g., writing failing tests, implementing minimal code, verifying green state).

Suggestions

Expand trigger terms to include natural user language like 'test-driven development', 'unit tests', 'test first', 'write tests before code'.

List more specific concrete actions such as 'writes failing tests, implements minimal code to pass, verifies RED-to-GREEN transitions, and refactors safely'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | It names the domain (TDD, code changes) and mentions some actions ('build behavior-safe code changes', 'RED/GREEN evidence'), but doesn't list multiple specific concrete actions like writing tests, running tests, refactoring, or verifying test output. | 2 / 3 |
| Completeness | It answers both 'what' (build behavior-safe code changes with TDD and RED/GREEN evidence) and 'when' (when he-plan or he-work requires TDD for a concrete behavior target) with an explicit 'Use when' clause containing trigger conditions. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'TDD', 'RED/GREEN', and 'behavior target', but uses non-standard terms like 'he-plan' and 'he-work', which are internal/custom references unlikely to be naturally said by users. Missing common variations like 'test-driven development', 'unit tests', 'test first'. | 2 / 3 |
| Distinctiveness Conflict Risk | The combination of TDD, RED/GREEN evidence, and the specific references to 'he-plan' and 'he-work' creates a clear niche that is unlikely to conflict with other skills. It targets a very specific workflow pattern. | 3 / 3 |
| Total | | 10 / 12 |

Passed

Implementation

57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is well-structured as a progressive disclosure entry point with excellent reference organization and clear 'Read when' signals. However, the body itself lacks concrete, executable examples of the TDD workflow it describes—no sample test code, no framework-specific commands, and no explicit validation checkpoints within the RED/GREEN cycle. The content would benefit from at least one concrete code example showing a RED-to-GREEN transition and more explicit inline verification steps.
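As an illustration of what a RED-to-GREEN transition could look like, here is a minimal sketch in Python. The behavior target (`slugify` collapsing whitespace) and every function name are hypothetical and do not come from the skill. In a real project the assertion would live in a pytest test, and the "before" and "after" versions would be two states of the same function rather than two functions.

```python
# Hypothetical behavior target (an assumption, not from the skill):
# slugify() must collapse runs of whitespace into a single hyphen.

def slugify_before(title: str) -> str:
    """Pre-fix code: splitting on a single space leaves empty segments."""
    return "-".join(title.strip().lower().split(" "))


def slugify_after(title: str) -> str:
    """Smallest fix: split() with no argument collapses whitespace runs."""
    return "-".join(title.lower().split())


def behavior_test(slugify) -> None:
    """The single test that pins the target behavior."""
    assert slugify("Hello   World") == "hello-world"


def run(slugify) -> str:
    """Return 'RED' if the behavior test fails, 'GREEN' if it passes."""
    try:
        behavior_test(slugify)
    except AssertionError:
        return "RED"
    return "GREEN"


if __name__ == "__main__":
    print("before fix:", run(slugify_before))
    print("after fix: ", run(slugify_after))
```

The same test is run twice and only the implementation changes, which is what makes the RED result usable as evidence that the GREEN result reflects the fix.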

Suggestions

Add a concrete, executable example showing a minimal RED test followed by the GREEN fix, using a specific test framework (e.g., pytest or jest) so Claude has a copy-paste-ready template.

Add explicit validation checkpoints within the Procedure steps, e.g., 'Confirm test output shows FAIL before proceeding to step 3' and 'If test does not fail, revisit the assertion before writing implementation code.'

Remove or consolidate the redundant Subagent Routing section since it's already referenced in Full Context, and trim the meta-commentary in the opening paragraph about what 'archived' means.
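The checkpoint suggestion above could be expressed as a small gate, sketched here in Python. This is an assumption about how such a gate might work, not part of the skill: `observe` and `gated_cycle` are hypothetical helpers, and a real workflow would more likely inspect the test runner's exit status than call test functions directly.

```python
from typing import Callable, List


def observe(test: Callable[[], None]) -> str:
    """Run one test callable and classify the outcome."""
    try:
        test()
    except AssertionError:
        return "RED"
    return "GREEN"


def gated_cycle(
    test_before_fix: Callable[[], None],
    test_after_fix: Callable[[], None],
) -> List[str]:
    """Collect RED/GREEN evidence, refusing to continue past a failed gate."""
    evidence = []

    # Checkpoint 1: the test must fail before any implementation work.
    before = observe(test_before_fix)
    if before != "RED":
        raise RuntimeError(
            "Test did not fail before the fix; revisit the assertion "
            "before writing implementation code."
        )
    evidence.append(before)

    # Checkpoint 2: the same test must pass once the fix is applied.
    after = observe(test_after_fix)
    if after != "GREEN":
        raise RuntimeError("Test still fails after the fix; GREEN not reached.")
    evidence.append(after)

    return evidence
```

A gate that raises instead of returning a status makes it hard to skip RED verification by accident, at the cost of requiring the caller to handle the error path explicitly.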

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is moderately efficient but includes some unnecessary sections. The 'Progressive Disclosure Entry' preamble explaining what 'archived' means is meta-commentary Claude doesn't need. The Subagent Routing section repeats a reference already listed in Full Context. The Examples section provides natural language prompts rather than actionable examples, and the Philosophy section adds little operational value. | 2 / 3 |
| Actionability | The procedure provides a clear sequence but remains at a high level of abstraction: 'produce a failing test first (RED), then apply the smallest fix (GREEN)' is directional rather than concrete. There are no executable code examples, no specific test framework commands (beyond the validation audit command), and no sample test code showing what a RED/GREEN cycle looks like in practice. The skill relies heavily on deferred references for actual operational detail. | 2 / 3 |
| Workflow Clarity | The 5-step procedure provides a reasonable sequence and the constraint 'Do not skip RED verification' acts as a checkpoint. However, there are no explicit validation gates between steps (e.g., 'confirm RED output before proceeding to GREEN'), no error recovery guidance if a test doesn't fail as expected, and the validation section only covers skill auditing rather than the TDD workflow itself. For a workflow involving iterative test-code cycles, the lack of inline verification checkpoints is a gap. | 2 / 3 |
| Progressive Disclosure | The skill excels at progressive disclosure with a concise overview and well-organized, clearly signaled references with 'Read when' annotations explaining when to load each reference. References are one level deep and cover distinct concerns (mocking, interface design, refactoring, etc.). The structure makes it easy to navigate to the right detailed document based on the current need. | 3 / 3 |
| Total | | 9 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | | 9 / 11 |

Passed
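To make the two warnings concrete, here is a hedged Python sketch of what such checks might test. The page does not show the validator's implementation or the skill's frontmatter, so `check_metadata` and the sample dictionaries are hypothetical, and the logic is only one plausible reading of the two warning messages.

```python
from typing import Any, Dict, List


def check_metadata(frontmatter: Dict[str, Any]) -> List[str]:
    """Return the names of the metadata checks that would warn."""
    warnings = []
    metadata = frontmatter.get("metadata")
    is_mapping = isinstance(metadata, dict)

    # metadata_version: a 'version' key is expected under 'metadata'.
    if not is_mapping or "version" not in metadata:
        warnings.append("metadata_version")

    # metadata_field: every key and every value must be a string.
    if not is_mapping or not all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    ):
        warnings.append("metadata_field")

    return warnings


if __name__ == "__main__":
    # Hypothetical frontmatter: no version, and a list where a string is expected.
    print(check_metadata({"metadata": {"owner": "someone", "tags": ["tdd"]}}))
    # Hypothetical frontmatter that would satisfy both checks.
    print(check_metadata({"metadata": {"version": "1.0.0", "owner": "someone"}}))
```

Under this reading, both warnings would clear once `metadata` carries a string `version` and no nested or non-string values.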

Repository: jscraik/Agent-Skills (Reviewed)
