
test-failure-mindset

This skill should be used when encountering failing tests or when the user asks about "test failure analysis", "debugging tests", "why tests fail", or needs to set a balanced investigative approach for test failures. Establishes mindset that treats test failures as valuable signals requiring investigation, not automatic dismissal.

69

Quality: 62% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/development-harness/skills/testing/test-failure-mindset/SKILL.md

Quality

Discovery: 54%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description has strong trigger term coverage but fails to communicate what the skill actually does beyond establishing a "mindset". It reads more like a philosophy statement than a capability description, and the lack of concrete actions makes it difficult to see when this skill would provide value over general debugging approaches.

Suggestions

Replace abstract language like 'establishes mindset' with concrete actions such as 'analyzes test output', 'identifies failure patterns', 'traces root causes', or 'suggests targeted fixes'.

Add specific capabilities that differentiate this from general debugging - e.g., 'parses test framework output', 'compares expected vs actual values', 'identifies flaky test patterns'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description lacks concrete actions. It mentions 'establishes mindset' and 'treats test failures as valuable signals' but doesn't describe what the skill actually does - no specific actions like 'analyzes stack traces', 'identifies root causes', or 'suggests fixes'. | 1 / 3 |
| Completeness | Has explicit 'Use when' guidance with trigger terms, but the 'what' portion is weak - it only describes a mindset/approach rather than concrete capabilities. The skill's actual functionality remains unclear. | 2 / 3 |
| Trigger Term Quality | Good coverage of natural trigger terms: 'failing tests', 'test failure analysis', 'debugging tests', 'why tests fail'. These are phrases users would naturally say when encountering test problems. | 3 / 3 |
| Distinctiveness / Conflict Risk | The test-specific focus provides some distinctiveness, but 'debugging tests' could overlap with general debugging skills. The mindset-focused framing ('establishes mindset', 'balanced investigative approach') is unusual but doesn't clearly carve out a unique niche. | 2 / 3 |
| Total | | 8 / 12 |

Passed

Implementation: 70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured mindset skill that establishes a balanced investigative approach for test failures. Its strengths are clear workflow sequencing and good organization, with tables and examples. The main weakness is a lack of concrete, executable guidance: it tells Claude what to think, but not which specific commands or tools to use during investigation.

Suggestions

Add concrete git commands for checking test history (e.g., `git blame path/to/test.py`, `git log --oneline -10 -- path/to/test.py`)

Include specific debugging techniques or commands relevant to common test frameworks (e.g., pytest verbose output, running single tests in isolation)

Consolidate 'Red Flags' and 'Good Practices' sections to reduce redundancy and improve token efficiency
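
The first two suggestions can be sketched as follows. This is a minimal, self-contained example: the repository, test file, and commit message are hypothetical, created in a temp directory only so the commands run end to end.

```shell
# Hypothetical demo repo so the history-check commands below actually run.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p tests
printf 'def test_example():\n    assert True\n' > tests/test_example.py
git add tests/test_example.py
git -c user.name=demo -c user.email=demo@example.com commit -qm "add failing test"

# Who last changed this test, and in which commit?
git blame -- tests/test_example.py

# Recent commits touching only this file:
git log --oneline -10 -- tests/test_example.py

# Framework-side checks (run manually; assumes pytest is installed):
#   pytest tests/test_example.py::test_example -vv   # single test, verbose output
#   pytest --lf -x                                   # rerun last failures, stop at first
```

Checking `git blame` and a file-scoped `git log` before editing a failing test helps distinguish "the test just changed" from "the code under test just changed", which is exactly the investigative step the review says the skill leaves abstract.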

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is reasonably efficient but includes some redundancy (e.g., 'Red Flags' and 'Good Practices' sections overlap conceptually with earlier guidance). The tables and examples are helpful but the overall document could be tightened. | 2 / 3 |
| Actionability | Provides a clear mental framework and investigation steps, but lacks concrete executable commands or code examples. The guidance is procedural but abstract - no specific debugging commands, git commands for history checking, or tool-specific instructions. | 2 / 3 |
| Workflow Clarity | The 5-step investigation protocol is clearly sequenced with explicit decision points. The decision table in step 4 provides clear branching logic, and the 'unclear' case appropriately suggests seeking clarification before proceeding. | 3 / 3 |
| Progressive Disclosure | For a mindset/approach skill of this length (~80 lines), the structure is appropriate. Content is well-organized with clear sections, tables for quick reference, and references to related skills at the end for deeper dives. | 3 / 3 |
| Total | | 10 / 12 |

Passed

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 |

Passed

Repository: Jamie-BitFlight/claude_skills (Reviewed)


Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.