Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".
Evals: Pending (no eval scenarios have been run).
Trust: Risky (do not use without reviewing).
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./plugins/api-development/api-load-tester/skills/load-testing-apis/SKILL.md`

Quality
Discovery: 89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description with explicit trigger guidance and a clear 'Use when' clause. Its main weakness is that the capability description is somewhat high-level—it could benefit from listing more specific concrete actions (e.g., simulating concurrent users, measuring latency/throughput, generating performance reports). The trigger terms are well-chosen and naturally match what users would say.
Suggestions
Add more specific concrete actions such as 'simulate concurrent users, measure response times and throughput, identify performance bottlenecks, generate load test reports' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | It names the domain (load/stress testing, API performance) and mentions some actions ('load and stress testing', 'validate API performance and scalability'), but doesn't list multiple concrete specific actions like generating reports, simulating concurrent users, measuring response times, or identifying bottlenecks. | 2 / 3 |
| Completeness | Clearly answers both 'what' (execute comprehensive load and stress testing to validate API performance and scalability) and 'when' (explicit 'Use when' clause and 'Trigger with phrases like' providing concrete trigger guidance). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'load test the API', 'stress test API', 'benchmark API performance', 'load', 'scalability', 'performance'. These are phrases users would naturally say when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | The combination of 'load test', 'stress test', and 'benchmark API performance' creates a clear niche that is unlikely to conflict with other skills like general API testing, unit testing, or monitoring skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation: 57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides a well-structured overview of load testing with good progressive disclosure and clear output expectations. Its main weakness is the lack of concrete, executable code examples — for a skill centered on generating test scripts, at least one complete k6 or Artillery script should be included. The workflow would also benefit from explicit validation checkpoints between load test iterations.
Suggestions
Add at least one complete, executable k6 script example (e.g., a basic ramp-up test with thresholds) so Claude has a concrete template to work from rather than just prose descriptions.
Add explicit validation checkpoints in the workflow, e.g., 'After baseline test: verify error rate < 1% and p95 < threshold before proceeding to 2x load. If thresholds are breached at baseline, stop and report findings.'
Convert the prose examples into actual k6/Artillery configuration snippets showing the exact scenario setup (stages, thresholds, request weights).
Trim prerequisites to remove explanations of well-known tools — Claude knows what Grafana, JWT tokens, and API keys are.
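The script example the suggestions call for could look like the following minimal k6 ramp-up sketch. This is an illustration, not part of the skill itself: the target URL, stage durations, and threshold values are placeholder assumptions that the skill would substitute from the API under test.

```javascript
// Minimal k6 ramp-up scenario with thresholds (run with: k6 run script.js).
// The endpoint and limits below are hypothetical placeholders.
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 20 }, // ramp up to 20 virtual users
    { duration: '3m', target: 20 }, // hold steady load
    { duration: '1m', target: 0 },  // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // fail the run if p95 latency >= 500 ms
    http_req_failed: ['rate<0.01'],   // fail the run if error rate >= 1%
  },
};

export default function () {
  const res = http.get('https://api.example.com/health'); // placeholder endpoint
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1); // pacing between iterations per virtual user
}
```

The `thresholds` block doubles as the pass/fail checkpoint the review asks for, since k6 exits non-zero when a threshold is breached.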
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is reasonably well-structured but includes some unnecessary verbosity. The prerequisites section explains things Claude would know (what Grafana/Prometheus are, what JWT tokens are). The examples section describes scenarios in prose rather than providing executable code. Some tightening is possible throughout. | 2 / 3 |
| Actionability | The instructions provide a clear sequence of steps but lack concrete, executable code examples. There are no actual k6 or Artillery script snippets — just descriptions of what to generate. For a skill about generating load test scripts, at least one complete, copy-paste-ready k6 script example would be expected. The examples section describes scenarios in prose rather than showing executable configurations. | 2 / 3 |
| Workflow Clarity | The 9-step workflow is clearly sequenced and logically ordered from discovery through execution to reporting. However, there are no explicit validation checkpoints or feedback loops — step 7 says to gradually increase load but doesn't specify what to check between iterations or when to stop. There's no 'if X fails, do Y' pattern within the main workflow despite this being a process where intermediate validation is critical. | 2 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections (Overview, Prerequisites, Instructions, Output, Error Handling, Examples, Resources). References to external files are one level deep and clearly signaled (implementation.md, errors.md, examples.md). The main file serves as a good overview without being monolithic. | 3 / 3 |
| Total | | 9 / 12 Passed |
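The inter-iteration checkpoint missing from step 7 can be sketched as a small decision helper: after each load level, compare the measured error rate and p95 latency against limits before doubling the load. The threshold values here are illustrative assumptions, not ones defined by the skill.

```javascript
// Sketch of a between-iterations validation checkpoint (hypothetical limits).
// Returns whether it is safe to escalate to the next load level, and why.
function shouldEscalate(metrics, limits = { maxErrorRate: 0.01, maxP95Ms: 500 }) {
  if (metrics.errorRate >= limits.maxErrorRate) {
    return { proceed: false, reason: `error rate ${metrics.errorRate} breached ${limits.maxErrorRate}` };
  }
  if (metrics.p95Ms >= limits.maxP95Ms) {
    return { proceed: false, reason: `p95 ${metrics.p95Ms} ms breached ${limits.maxP95Ms} ms` };
  }
  return { proceed: true, reason: 'thresholds healthy; safe to increase load' };
}

// Baseline run passes, so the 2x iteration may proceed:
console.log(shouldEscalate({ errorRate: 0.002, p95Ms: 310 }).proceed); // true
// Error rate breached at baseline: stop and report instead of escalating.
console.log(shouldEscalate({ errorRate: 0.05, p95Ms: 310 }).proceed);  // false
```

This encodes the suggested 'if thresholds are breached at baseline, stop and report findings' rule as an explicit branch rather than prose.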
Validation: 81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 9 / 11 Passed
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 Passed |