Expert performance testing and optimization specialist focused on measuring, analyzing, and improving system performance across all applications and infrastructure
Quality: 21% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Checks: Passed (no known issues)

Optimize this skill with Tessl
```
npx tessl skill review --optimize ./testing-performance-benchmarker/skills/SKILL.md
```

Quality
Discovery: 14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description reads like a resume headline rather than a functional skill description. It lacks concrete actions, specific trigger terms, any 'when to use' guidance, and is so broadly scoped that it would conflict with numerous other skills. It needs a complete rewrite with specific capabilities, tools, and explicit trigger conditions.
Suggestions

- Add a 'Use when...' clause with specific trigger terms like 'load test', 'benchmark', 'latency', 'throughput', 'response time', 'profiling', or 'performance bottleneck'.
- Replace vague language with concrete actions, e.g., 'Runs load tests, profiles application bottlenecks, analyzes response time metrics, and generates optimization recommendations'.
- Narrow the scope to a specific domain (e.g., web application performance, database query optimization, API load testing) to reduce conflict risk with other skills.
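Taken together, these suggestions imply frontmatter along the following lines. This is an illustrative sketch, not the skill's actual frontmatter; the description text and trigger terms are assembled from the suggestions above:

```yaml
name: testing-performance-benchmarker
description: >
  Runs k6 load tests against web APIs, profiles application bottlenecks,
  analyzes latency and throughput metrics, and generates optimization
  recommendations. Use when the user mentions load testing, benchmarking,
  latency, throughput, response time, profiling, or a performance bottleneck.
```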
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'measuring, analyzing, and improving system performance' without listing any concrete actions. Terms like 'expert' and 'specialist' are buzzwords that don't describe specific capabilities. | 1 / 3 |
| Completeness | The 'what' is vaguely stated and the 'when' is entirely missing. There is no 'Use when...' clause or equivalent explicit trigger guidance, and the description reads more like a job title than actionable skill documentation. | 1 / 3 |
| Trigger Term Quality | Contains some relevant keywords like 'performance testing', 'optimization', and 'infrastructure', but lacks specific natural terms users would say such as 'load test', 'latency', 'throughput', 'benchmark', 'profiling', 'response time', or specific tool names. | 2 / 3 |
| Distinctiveness / Conflict Risk | The phrase 'across all applications and infrastructure' is extremely broad and would conflict with many other skills related to debugging, monitoring, DevOps, or general code optimization. There is no clear niche defined. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation: 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads more like a persona/role-play prompt than an actionable skill document. It spends most of its token budget on identity description, communication style, success metrics, and abstract capability lists rather than concrete, executable guidance. The one strong element—the k6 code example—is buried in an overwhelming amount of verbose, non-actionable content that Claude already knows.
Suggestions

- Remove all persona/identity/communication style sections and focus exclusively on actionable instructions. Claude doesn't need to be told it's 'analytical' or 'metrics-focused' to perform well.
- Extract the k6 example, report template, and advanced capabilities into separate referenced files (e.g., EXAMPLES.md, REPORT_TEMPLATE.md) and keep SKILL.md as a concise overview with clear navigation.
- Replace abstract instructions like 'identify bottlenecks through systematic analysis' with specific commands, tool invocations, and decision trees (e.g., 'Run `k6 run --out json=results.json script.js`, then check if p95 > threshold; if so, profile with...').
- Add explicit validation checkpoints and feedback loops to the workflow, such as 'If error rate exceeds 1%, reduce VU count and re-run to isolate the failing endpoint before proceeding.'
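The checkpoint-and-feedback-loop suggestion can be sketched as a small post-run gate. This is a hedged illustration, assuming a summary produced by `k6 run --summary-export=summary.json`; the field names follow k6's summary-export format, and the thresholds are placeholders the skill author would tune:

```python
import json  # stdlib; used if loading summary.json from disk

P95_LIMIT_MS = 500.0      # illustrative latency threshold
ERROR_RATE_LIMIT = 0.01   # 1% error-rate gate from the suggestion above

def check_run(summary: dict) -> list[str]:
    """Return follow-up actions for a k6 run, per the decision tree above."""
    actions = []
    p95 = summary["metrics"]["http_req_duration"]["p(95)"]
    error_rate = summary["metrics"]["http_req_failed"]["value"]
    if error_rate > ERROR_RATE_LIMIT:
        # Feedback loop: isolate failures before trusting latency numbers.
        actions.append("reduce VU count and re-run to isolate the failing endpoint")
    if p95 > P95_LIMIT_MS:
        actions.append("profile the application to find the bottleneck")
    if not actions:
        actions.append("record baseline and proceed to optimization comparison")
    return actions

sample = {"metrics": {
    "http_req_duration": {"p(95)": 742.1},
    "http_req_failed": {"value": 0.003},
}}
print(check_run(sample))
```

Encoding the gate as code (rather than prose like 'identify bottlenecks through systematic analysis') gives the agent an unambiguous next step at each checkpoint.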
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose, with extensive sections describing personality, identity, communication style, success metrics, and learning/memory that add no actionable value. Much of the content describes what the agent 'is' rather than what to do. The document is heavily padded with emoji headers, motivational framing, and concepts Claude already knows (e.g., what Core Web Vitals are, what load testing means). | 1 / 3 |
| Actionability | The k6 code example is concrete and executable, which is a strength. However, the vast majority of the document is abstract guidance ('identify bottlenecks through systematic analysis,' 'implement advanced frontend performance techniques') without specific commands, configurations, or executable steps. The deliverable template is a fill-in-the-blank report rather than actionable instructions. | 2 / 3 |
| Workflow Clarity | The 4-step workflow provides a reasonable sequence but lacks explicit validation checkpoints and feedback loops. Steps like 'Execute comprehensive performance testing' and 'Identify bottlenecks through systematic analysis' are too abstract. There is no clear guidance on what to do when tests fail or how to verify that optimizations actually worked, beyond a vague mention of 'before/after comparisons'. | 2 / 3 |
| Progressive Disclosure | The document is a monolithic wall of text at roughly 250+ lines with no references to external files for detailed content. Everything is inlined: the k6 example, the report template, the advanced capabilities list, the communication style guide. The final line references 'core training', which is vague and unhelpful. Content like the full HTML report generator and the detailed report template should be in separate referenced files. | 1 / 3 |
| Total | | 6 / 12 Passed |
Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
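The one warning can typically be cleared by moving non-spec keys under `metadata`, as the check itself suggests. A hedged sketch, since the report does not show which keys triggered the warning; the key names below are hypothetical:

```yaml
# Before: hypothetical unknown top-level keys trip frontmatter_unknown_keys
# author: jane
# category: performance

# After: the same keys nested under metadata
name: testing-performance-benchmarker
metadata:
  author: jane
  category: performance
```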
09aef5d