
detecting-performance-regressions

Automatically detect performance regressions in CI/CD pipelines by comparing metrics against baselines. Use when validating builds or analyzing performance trends. Trigger with phrases like "detect performance regression", "compare performance metrics", or "analyze performance degradation".


Quality: 44% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security (by Snyk): Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/performance/performance-regression-detector/skills/detecting-performance-regressions/SKILL.md

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description that clearly communicates both what the skill does and when to use it, with explicit trigger phrases. Its main weakness is that the capability description could be more specific about concrete actions (e.g., types of metrics analyzed, output formats, threshold configuration). Overall it performs well across most dimensions.

Suggestions

Add more specific concrete actions such as 'generate regression reports', 'flag metrics exceeding thresholds', or 'compare latency/throughput/error rates' to improve specificity.
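For example, a tightened description along these lines (the exact wording below is hypothetical, not taken from the skill) might read:

```yaml
description: >
  Detect performance regressions in CI/CD pipelines by comparing
  latency, throughput, and error-rate metrics against stored baselines,
  flagging metrics that exceed configured thresholds and generating
  regression reports. Use when validating builds or analyzing
  performance trends. Trigger with phrases like "detect performance
  regression", "compare performance metrics", or "analyze performance
  degradation".
```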

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain (CI/CD pipelines, performance regressions) and mentions actions like 'detect', 'compare metrics against baselines', and 'analyze performance trends', but it doesn't list multiple concrete specific actions (e.g., what metrics, what output formats, what specific comparisons). It stays somewhat high-level. | 2 / 3 |
| Completeness | Clearly answers both 'what' (detect performance regressions by comparing metrics against baselines) and 'when' (validating builds, analyzing performance trends), with explicit trigger phrases provided. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'detect performance regression', 'compare performance metrics', 'analyze performance degradation', 'CI/CD pipelines', 'validating builds', 'performance trends'. These are phrases users would naturally use when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | The description carves out a clear niche around performance regression detection in CI/CD pipelines with distinct trigger phrases. It is unlikely to conflict with general monitoring, testing, or other CI/CD skills due to the specific focus on performance regression and baseline comparison. | 3 / 3 |
| Total | | 11 / 12 |

Passed

Implementation

0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is almost entirely abstract description with no actionable content. It reads like a marketing overview rather than an operational guide—there are no code examples, no specific tools or libraries, no data format specifications, no threshold values, and no concrete commands. Claude would not be able to execute any meaningful performance regression detection based on this content alone.

Suggestions

Replace abstract descriptions with executable code examples showing how to load baseline data, compute statistical comparisons (e.g., using scipy.stats for t-tests or z-scores), and generate regression reports with specific output formats.

Define concrete data formats for baselines and metrics (e.g., JSON schema or CSV structure) so Claude knows exactly what to read and write.

Add explicit validation checkpoints in the workflow, such as verifying baseline data exists and is valid before proceeding, and checking that statistical results are meaningful before generating reports.

Remove sections that explain concepts Claude already knows (Overview, How It Works, When to Use, Best Practices, Integration, Resources) and replace with a lean quick-start section with copy-paste-ready code.
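To make the first suggestion concrete, a minimal sketch of the kind of executable content the skill could include is shown below. The baseline file layout, the metric names, and the 2-sigma threshold are illustrative assumptions, not something the skill currently specifies (`scipy.stats` would serve equally well for a proper t-test):

```python
import json
import statistics

def detect_regressions(baseline_path, current_metrics, sigma_threshold=2.0):
    """Flag metrics whose current value exceeds the baseline mean by
    more than `sigma_threshold` standard deviations.

    Assumes (hypothetically) a baseline JSON file shaped like:
        {"latency_ms": [120.0, 118.5, 121.2], "error_rate": [0.01, 0.02]}
    """
    with open(baseline_path) as f:
        baseline = json.load(f)

    regressions = {}
    for metric, value in current_metrics.items():
        samples = baseline.get(metric)
        if not samples or len(samples) < 2:
            continue  # no usable baseline for this metric; skip it
        mean = statistics.fmean(samples)
        stdev = statistics.stdev(samples)
        if stdev == 0:
            # Zero-variance baseline: any increase counts as a regression.
            z = float("inf") if value > mean else 0.0
        else:
            z = (value - mean) / stdev
        if z > sigma_threshold:
            regressions[metric] = {
                "baseline_mean": round(mean, 3),
                "current": value,
                "z_score": round(z, 2),
            }
    return regressions
```

A CI step could then fail the build whenever `detect_regressions` returns a non-empty dict, which also gives the "explicit validation checkpoint" the third suggestion asks for.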

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose with extensive padding. The 'Overview' restates the title, 'How It Works' is vague hand-waving, 'When to Use This Skill' repeats the description, and sections like 'Integration', 'Resources', and 'Best Practices' contain generic platitudes Claude already knows. Nearly every section could be eliminated or drastically reduced. | 1 / 3 |
| Actionability | No concrete code, commands, scripts, or executable examples anywhere. The 'Examples' section describes what the skill 'will do' in abstract terms rather than showing how. The 'Instructions' are vague steps like 'Apply statistical analysis' with no specifics on which tools, libraries, commands, thresholds, or data formats to use. | 1 / 3 |
| Workflow Clarity | The 6-step 'Instructions' section lists abstract steps with no concrete commands, no validation checkpoints, no feedback loops, and no error recovery between steps. For a multi-step process involving statistical analysis and CI/CD integration, this lacks any meaningful operational detail. | 1 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files for detailed content. Mentions '${CLAUDE_SKILL_DIR}/performance/baselines/' but doesn't link to any supporting documentation. Multiple sections that could be separate files (statistical methods, threshold configuration, baseline format) are either absent or inlined as vague bullet points. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)
