
test-results-analyzer

Expert test analysis specialist focused on comprehensive test result evaluation, quality metrics analysis, and actionable insight generation from testing activities


Quality: 6%. Does it follow best practices?

Impact: Pending. No eval scenarios have been run.

Security (by Snyk): Passed. No known issues.

Optimize this skill with Tessl

npx tessl skill review --optimize ./testing-test-results-analyzer/skills/SKILL.md

Quality

Discovery: 0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is a collection of vague buzzwords ('comprehensive', 'actionable insight generation', 'expert specialist') without any concrete actions, natural trigger terms, or explicit usage guidance. It reads like a job title rather than a functional skill description, making it nearly impossible for Claude to reliably select this skill over others.

Suggestions

Replace abstract phrases with concrete actions, e.g., 'Parses test output logs, identifies failing tests, calculates pass/fail rates, detects flaky tests, and summarizes coverage gaps.'

Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user shares test results, CI/CD output, pytest logs, JUnit reports, test coverage data, or asks about test failures.'

Remove filler words like 'expert', 'comprehensive', and 'actionable' that add no discriminative value for skill selection.
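Putting these suggestions together, a rewritten SKILL.md description might look like the following. The wording is illustrative only, not the maintainer's actual text:

```markdown
---
name: test-results-analyzer
description: >
  Parses test output (pytest logs, JUnit XML, coverage reports), identifies
  failing and flaky tests, calculates pass/fail rates, and summarizes
  coverage gaps. Use when the user shares test results, CI/CD output,
  JUnit reports, or coverage data, or asks about test failures.
---
```

Every phrase in this version names a concrete action or a natural trigger term, which is what the Specificity, Completeness, and Trigger Term Quality dimensions below reward.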

Dimension scores (reasoning and score per dimension):

Specificity (1 / 3): The description uses vague, abstract language like 'comprehensive test result evaluation', 'quality metrics analysis', and 'actionable insight generation' without listing any concrete actions. These are buzzwords rather than specific capabilities.

Completeness (1 / 3): The description vaguely addresses 'what' with abstract language but completely lacks any 'when' clause or explicit trigger guidance. There is no 'Use when...' or equivalent, and the 'what' is too vague to be useful.

Trigger Term Quality (1 / 3): The description lacks natural keywords a user would actually say. Terms like 'comprehensive test result evaluation' and 'actionable insight generation' are corporate jargon, not phrases users would type. Missing natural terms like 'test failures', 'test report', 'flaky tests', 'coverage', 'CI results', etc.

Distinctiveness Conflict Risk (1 / 3): The description is extremely generic and could overlap with any skill related to testing, quality assurance, data analysis, or metrics. 'Test analysis' and 'quality metrics' are broad enough to conflict with many other skills.

Total: 4 / 12 (Passed)

Implementation: 12%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is a persona/role-play document rather than an actionable skill. It spends most of its token budget on identity descriptions, communication style guidelines, success metrics, and non-executable pseudocode stubs. The content explains concepts Claude already understands (statistical analysis, ML, test coverage) without providing any concrete, executable implementations or specific tool configurations that would actually help Claude perform test analysis.

Suggestions

Replace the stub-heavy Python class with 1-2 small, fully executable code examples that demonstrate a specific analysis task (e.g., parsing a JUnit XML file and computing pass rates).

Remove persona/identity sections (Identity & Memory, Communication Style, Learning & Memory, Success Metrics) which consume tokens without adding actionable guidance.

Add concrete validation checkpoints to the workflow, such as 'verify data completeness before analysis' with specific checks, and 'validate statistical significance before reporting' with threshold values.

Split advanced capabilities and the report template into separate referenced files, keeping SKILL.md as a concise overview with clear pointers.
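As an illustration of the first and third suggestions, here is a minimal, fully executable sketch of the kind of example the skill could inline: parsing a JUnit XML report, computing a pass rate, and including a data-completeness checkpoint. The file path and report shape are hypothetical; real CI systems vary in how they nest `<testsuite>` elements:

```python
import xml.etree.ElementTree as ET

def summarize_junit(path):
    """Parse a JUnit XML report and compute pass/fail counts and the pass rate."""
    root = ET.parse(path).getroot()
    # Reports may wrap suites in a <testsuites> root or start at a bare <testsuite>.
    suites = root.iter("testsuite") if root.tag == "testsuites" else [root]
    total = failures = errors = skipped = 0
    for suite in suites:
        total += int(suite.get("tests", 0))
        failures += int(suite.get("failures", 0))
        errors += int(suite.get("errors", 0))
        skipped += int(suite.get("skipped", 0))
    executed = total - skipped
    # Validation checkpoint: verify data completeness before analysis,
    # as the suggestions above recommend.
    if executed <= 0:
        raise ValueError(f"{path}: no executed tests; report may be incomplete")
    passed = executed - failures - errors
    return {
        "total": total,
        "passed": passed,
        "failed": failures + errors,
        "skipped": skipped,
        "pass_rate": passed / executed,
    }
```

Unlike the stub methods the review flags below, every line here runs as written, which is the property the Actionability dimension is scoring.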

Dimension scores (reasoning and score per dimension):

Conciseness (1 / 3): Extremely verbose at ~250+ lines. Extensively explains concepts Claude already knows (what statistical analysis is, what test coverage means, what ML models do). The 'personality' framing, success metrics, learning & memory sections, and communication style guidelines are all padding that don't provide actionable value. The code example is largely pseudocode with stub methods that don't execute.

Actionability (1 / 3): Despite the lengthy code block, none of it is executable: every meaningful method is a stub (e.g., _assess_file_risk, _categorize_failure, and _analyze_failure_trends are all undefined). The skill describes what to do at a high level but never provides concrete, copy-paste-ready implementations. The report template is a skeleton with placeholders rather than actionable guidance.

Workflow Clarity (2 / 3): The 4-step workflow process is listed with a reasonable sequence (data collection → analysis → risk assessment → reporting), but lacks validation checkpoints, error handling, or feedback loops. There's no guidance on what to do when data is incomplete, when models underperform, or when analysis reveals ambiguous results.

Progressive Disclosure (1 / 3): The entire skill is a monolithic wall of text with no references to external files or resources. All content, from code examples to report templates to advanced capabilities, is inlined in a single document. The final line references 'core training', which is meaningless. No navigation structure or content splitting is present.

Total: 5 / 12 (Passed)

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria results:

frontmatter_unknown_keys (Warning): Unknown frontmatter key(s) found; consider removing or moving to metadata.

Total: 10 / 11 (Passed)

Repository: OpenRoster-ai/awesome-openroster (Reviewed)

