Expert performance testing and optimization specialist focused on measuring, analyzing, and improving system performance across all applications and infrastructure
Quality: 21% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Checks: Passed (no known issues)

Optimize this skill with Tessl
```
npx tessl skill review --optimize ./testing-performance-benchmarker/skills/SKILL.md
```

Quality
Discovery: 14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description reads like a resume headline rather than a functional skill description. It lacks concrete actions, specific trigger terms, any 'when to use' guidance, and is so broadly scoped that it would conflict with numerous other skills. It needs a complete rewrite with specific capabilities, tools, and explicit trigger conditions.
Suggestions

- Add a 'Use when...' clause with specific trigger terms like 'load test', 'benchmark', 'latency', 'throughput', 'response time', 'profiling', or 'performance bottleneck'.
- Replace vague language with concrete actions, e.g., 'Runs load tests, profiles application bottlenecks, analyzes response time metrics, and generates optimization recommendations'.
- Narrow the scope to a specific domain (e.g., web application performance, database query optimization, API load testing) to reduce conflict risk with other skills.
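Taken together, these suggestions imply frontmatter along the following lines. This is an illustrative sketch, not the skill's actual frontmatter; the description text and trigger terms are assembled from the suggestions above:

```yaml
name: testing-performance-benchmarker
description: >
  Runs k6 load tests against web APIs, profiles application bottlenecks,
  analyzes latency and throughput metrics, and generates optimization
  recommendations. Use when the user mentions load testing, benchmarking,
  latency, throughput, response time, profiling, or a performance bottleneck.
```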
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'measuring, analyzing, and improving system performance' without listing any concrete actions. Terms like 'expert' and 'specialist' are buzzwords that don't describe specific capabilities. | 1 / 3 |
| Completeness | The 'what' is vaguely stated and the 'when' is entirely missing. There is no 'Use when...' clause or equivalent explicit trigger guidance, and the description reads more like a job title than actionable skill documentation. | 1 / 3 |
| Trigger Term Quality | Contains some relevant keywords like 'performance testing', 'optimization', and 'infrastructure', but lacks specific natural terms users would say such as 'load test', 'latency', 'throughput', 'benchmark', 'profiling', 'response time', or specific tool names. | 2 / 3 |
| Distinctiveness / Conflict Risk | The phrase 'across all applications and infrastructure' is extremely broad and would conflict with many other skills related to debugging, monitoring, DevOps, or general code optimization. There is no clear niche defined. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation: 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads more like a persona/role-play prompt than an actionable skill document. It spends most of its token budget on identity description, communication style, success metrics, and abstract capability lists rather than concrete, executable guidance. The one strong element—the k6 code example—is buried in an overwhelming amount of verbose, non-actionable content that Claude already knows.
Suggestions

- Remove all persona/identity/communication style sections and focus exclusively on actionable instructions. Claude doesn't need to be told it's 'analytical' or 'metrics-focused' to perform well.
- Extract the k6 example, report template, and advanced capabilities into separate referenced files (e.g., EXAMPLES.md, REPORT_TEMPLATE.md) and keep SKILL.md as a concise overview with clear navigation.
- Replace abstract instructions like 'identify bottlenecks through systematic analysis' with specific commands, tool invocations, and decision trees (e.g., 'Run `k6 run --out json=results.json script.js`, then check if p95 > threshold; if so, profile with...').
- Add explicit validation checkpoints and feedback loops to the workflow, such as 'If error rate exceeds 1%, reduce VU count and re-run to isolate the failing endpoint before proceeding.'
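The checkpoint-and-feedback-loop suggestion can be sketched as a small post-run gate. This is a hedged illustration, assuming a summary produced by `k6 run --summary-export=summary.json`; the field names follow k6's summary-export format, and the thresholds are placeholders the skill author would tune:

```python
import json  # stdlib; used if loading summary.json from disk

P95_LIMIT_MS = 500.0      # illustrative latency threshold
ERROR_RATE_LIMIT = 0.01   # 1% error-rate gate from the suggestion above

def check_run(summary: dict) -> list[str]:
    """Return follow-up actions for a k6 run, per the decision tree above."""
    actions = []
    p95 = summary["metrics"]["http_req_duration"]["p(95)"]
    error_rate = summary["metrics"]["http_req_failed"]["value"]
    if error_rate > ERROR_RATE_LIMIT:
        # Feedback loop: isolate failures before trusting latency numbers.
        actions.append("reduce VU count and re-run to isolate the failing endpoint")
    if p95 > P95_LIMIT_MS:
        actions.append("profile the application to find the bottleneck")
    if not actions:
        actions.append("record baseline and proceed to optimization comparison")
    return actions

sample = {"metrics": {
    "http_req_duration": {"p(95)": 742.1},
    "http_req_failed": {"value": 0.003},
}}
print(check_run(sample))
```

Encoding the gate as code (rather than prose like 'identify bottlenecks through systematic analysis') gives the agent an unambiguous next step at each checkpoint.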
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose, with extensive sections describing personality, identity, communication style, success metrics, and learning/memory that add no actionable value. Much of the content describes what the agent 'is' rather than what to do. The document is heavily padded with emoji headers, motivational framing, and concepts Claude already knows (e.g., what Core Web Vitals are, what load testing means). | 1 / 3 |
| Actionability | The k6 code example is concrete and executable, which is a strength. However, the vast majority of the document is abstract guidance ('identify bottlenecks through systematic analysis,' 'implement advanced frontend performance techniques') without specific commands, configurations, or executable steps. The deliverable template is a fill-in-the-blank report rather than actionable instructions. | 2 / 3 |
| Workflow Clarity | The 4-step workflow provides a reasonable sequence but lacks explicit validation checkpoints and feedback loops. Steps like 'Execute comprehensive performance testing' and 'Identify bottlenecks through systematic analysis' are too abstract. There is no clear guidance on what to do when tests fail or how to verify that optimizations actually worked, beyond a vague mention of 'before/after comparisons'. | 2 / 3 |
| Progressive Disclosure | The document is a monolithic wall of text at roughly 250+ lines with no references to external files for detailed content. Everything is inlined: the k6 example, the report template, the advanced capabilities list, the communication style guide. The final line references 'core training', which is vague and unhelpful. Content like the full HTML report generator and the detailed report template should be in separate referenced files. | 1 / 3 |
| Total | | 6 / 12 Passed |
Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
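The one warning can typically be cleared by moving non-spec keys under `metadata`, as the check itself suggests. A hedged sketch, since the report does not show which keys triggered the warning; the key names below are hypothetical:

```yaml
# Before: hypothetical unknown top-level keys trip frontmatter_unknown_keys
# author: jane
# category: performance

# After: the same keys nested under metadata
name: testing-performance-benchmarker
metadata:
  author: jane
  category: performance
```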
09aef5d