Agent skill for v3-performance-engineer - invoke with $agent-v3-performance-engineer
35
0%
Does it follow best practices?
Impact
96%
3.20x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./.agents/skills/agent-v3-performance-engineer/SKILL.md
Quality
Discovery
0%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an extremely weak description that provides virtually no useful information for skill selection. It only contains an invocation command and a generic label, failing on every dimension. Claude would have no basis to select this skill appropriately from a pool of available skills.
Suggestions
Add concrete actions describing what the skill does, e.g., 'Profiles application performance, identifies bottlenecks, optimizes query execution, and analyzes resource utilization metrics.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user mentions performance optimization, slow queries, latency issues, load testing, benchmarking, or resource profiling.'
Specify the domain or technology stack to distinguish this from other performance-related skills, e.g., 'for v3 API endpoints' or 'for Python/Node.js applications.'
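The suggestions above can be turned into a quick self-check before resubmitting a description. The following is a minimal sketch, assuming a simple substring heuristic; it is not how Tessl scores descriptions, and `checkDescription`, `DescriptionCheck`, and the trigger stems are hypothetical names chosen for illustration. The improved description is assembled from the example wording in the suggestions.

```typescript
// Hypothetical heuristic (not Tessl's scoring): report whether a skill
// description has a "Use when" clause and which trigger stems it contains.
interface DescriptionCheck {
  hasUseWhenClause: boolean;
  matchedTriggers: string[];
}

function checkDescription(description: string, triggerStems: string[]): DescriptionCheck {
  const lower = description.toLowerCase();
  return {
    hasUseWhenClause: lower.includes("use when"),
    // Stems such as "optimiz" match "optimize", "optimizes", "optimization".
    matchedTriggers: triggerStems.filter((stem) => lower.includes(stem.toLowerCase())),
  };
}

// Assumed trigger stems, taken from the terms the review calls out as missing.
const triggerStems = ["optimiz", "benchmark", "latency", "profil"];

// The description as reviewed.
const currentDescription =
  "Agent skill for v3-performance-engineer - invoke with $agent-v3-performance-engineer";

// A candidate built from the example wording in the suggestions.
const improvedDescription =
  "Profiles application performance, identifies bottlenecks, optimizes query execution, " +
  "and analyzes resource utilization metrics. Use when the user mentions performance " +
  "optimization, slow queries, latency issues, load testing, benchmarking, or resource profiling.";

const currentCheck = checkDescription(currentDescription, triggerStems);
const improvedCheck = checkDescription(improvedDescription, triggerStems);

console.log("current:", currentCheck);
console.log("improved:", improvedCheck);
```

A substring check like this only catches the absence of a trigger clause and keywords; it says nothing about whether the description distinguishes the skill from other performance-related skills.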
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description contains no concrete actions whatsoever. It only states it's an 'agent skill' with an invocation command, providing zero information about what the skill actually does. | 1 / 3 |
| Completeness | Neither 'what does this do' nor 'when should Claude use it' is answered. The description only provides an invocation command, which is not a substitute for explaining capabilities or trigger conditions. | 1 / 3 |
| Trigger Term Quality | The only potentially relevant term is 'performance-engineer' embedded in the agent name, but there are no natural keywords a user would say. No terms like 'optimize', 'benchmark', 'latency', 'profiling', or any domain-specific triggers are present. | 1 / 3 |
| Distinctiveness Conflict Risk | The description is so vague that it could conflict with any performance-related skill. Without specifying what kind of performance engineering (web, database, system, application), it provides no clear niche. | 1 / 3 |
| Total | | 4 / 12 Passed |
Implementation
0%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is a verbose, aspirational design document masquerading as an actionable skill. It contains hundreds of lines of non-executable pseudocode referencing fictional APIs, repeats performance targets multiple times across different sections, and provides no concrete, actionable steps for Claude to follow. The decorative ASCII boxes, lengthy TypeScript class stubs, and coordination notes add significant token cost with zero practical value.
Suggestions
Replace fictional TypeScript benchmark classes with actual executable commands or scripts that Claude can run (e.g., real npm benchmark commands, actual profiling tool invocations).
Define a clear step-by-step workflow: what to measure first, how to measure it, what tools to use, what thresholds trigger action, and what actions to take.
Eliminate redundancy — state performance targets once in a concise table rather than repeating them in hooks, ASCII boxes, code comments, and checklists.
Extract detailed benchmark implementations into separate referenced files and keep SKILL.md as a concise overview with clear navigation to those files.
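As an illustration of what the first three suggestions could look like in practice, here is a minimal sketch of an executable benchmark that states each target once and checks it against a threshold. It uses only `performance.now()` from Node's `node:perf_hooks`; the `Target` shape, the `sort-10k-numbers` workload, and the 50 ms threshold are assumptions made for the example, since the reviewed skill's actual targets are not shown in this review.

```typescript
import { performance } from "node:perf_hooks";

// One entry per target, so each threshold is stated exactly once.
interface Target {
  name: string;
  maxMedianMs: number; // the check fails if the median exceeds this
  run: () => void;     // the operation under test
}

interface Result {
  name: string;
  medianMs: number;
  passed: boolean;
}

function median(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Warm up first, then time each iteration and compare the median to the target.
function benchmark(target: Target, iterations = 50, warmup = 5): Result {
  for (let i = 0; i < warmup; i++) target.run();
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    target.run();
    samples.push(performance.now() - start);
  }
  const medianMs = median(samples);
  return { name: target.name, medianMs, passed: medianMs <= target.maxMedianMs };
}

// Hypothetical workload and threshold, for illustration only.
const targets: Target[] = [
  {
    name: "sort-10k-numbers",
    maxMedianMs: 50,
    run: () => {
      Array.from({ length: 10_000 }, (_, i) => (i * 7919) % 10_007).sort((a, b) => a - b);
    },
  },
];

const results = targets.map((t) => benchmark(t));
for (const r of results) {
  console.log(`${r.name}: median ${r.medianMs.toFixed(3)} ms -> ${r.passed ? "PASS" : "FAIL"}`);
}
```

The median is used rather than the mean so that a single slow iteration (for example, a garbage-collection pause) does not flip the result. A script like this could live in a separate referenced file, leaving SKILL.md with only the targets and the command to run it.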
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at 300+ lines. The ASCII box diagrams are decorative and redundant with the text. The TypeScript benchmark classes are lengthy pseudocode that Claude cannot execute — they describe hypothetical APIs (this.sona.adapt, this.flashAttention, etc.) that don't exist. The performance targets are repeated multiple times (in hooks, mission section, target matrix, and checklist). Massive token waste. | 1 / 3 |
| Actionability | Despite containing extensive TypeScript code, none of it is executable — it references fictional APIs (this.agentDBMemory.hnswSearch, this.sona.adapt, this.standardAttention, etc.) with no real implementations. There are no concrete commands to run, no actual benchmark scripts to execute, no real tool invocations. The entire skill describes aspirational targets rather than instructing how to achieve them. | 1 / 3 |
| Workflow Clarity | There is no clear workflow or sequence of steps. The content is organized as a collection of benchmark class definitions and checklists, but there's no guidance on what to do first, how to proceed, or what validation checkpoints to follow. The checklist at the end is a list of goals, not a workflow. No feedback loops or error recovery paths are defined. | 1 / 3 |
| Progressive Disclosure | Monolithic wall of content with no references to external files. Everything is inlined in a single massive document. The coordination section references other agents but provides no links or file references. No bundle files exist to support progressive disclosure. Content that could be split (benchmark implementations, monitoring dashboard, regression detection) is all crammed into one file. | 1 / 3 |
| Total | | 4 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
d29d87f