
detecting-performance-regressions

tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill detecting-performance-regressions

Automatically detect performance regressions in CI/CD pipelines by comparing metrics against baselines. Use when validating builds or analyzing performance trends. Trigger with phrases like "detect performance regression", "compare performance metrics", or "analyze performance degradation".

Overall score: 51%


Validation

Score: 81%

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |

Total: 13 / 16 checks passed

Implementation

Score: 7%

This skill content is highly verbose and abstract, explaining concepts Claude already understands while failing to provide any concrete, executable guidance. It describes what the skill does conceptually but never shows how to actually detect regressions - no code, no specific tools, no file formats, no actual commands. The content reads like marketing copy rather than actionable instructions.

Suggestions

- Replace abstract descriptions with concrete code examples showing how to load baseline data, compare metrics, and detect regressions (e.g., Python scripts with pandas/scipy for statistical analysis); a minimal sketch follows this list.
- Specify the exact file formats for baselines and metrics (JSON schema, CSV structure) and provide example data.
- Add explicit validation steps with actual commands: "Run `python validate_baseline.py`; if it returns errors, fix X before proceeding."
- Remove redundant sections (Overview, How It Works, and When to Use all repeat similar information) and consolidate into a lean quick-start with executable examples.
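
To make the first two suggestions concrete, here is a minimal sketch of the kind of script the skill could ship. Everything in it is an assumption invented for illustration, not part of the reviewed skill: the baseline/current JSON file layout, the metric names, and the 10% threshold are all hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical regression check comparing current CI metrics to a stored baseline.

Assumed (illustrative) file layout, not defined by the reviewed skill:
    baseline.json: {"p95_latency_ms": 120.0, "throughput_rps": 850.0, "peak_memory_mb": 512.0}
    current.json:  {"p95_latency_ms": 141.0, "throughput_rps": 900.0, "peak_memory_mb": 510.0}
"""
import json
import sys

# Metrics where an increase is a regression; for the rest, a decrease is.
HIGHER_IS_WORSE = {"p95_latency_ms", "peak_memory_mb"}
THRESHOLD = 0.10  # flag changes beyond 10% of baseline (arbitrary example value)


def detect_regressions(baseline: dict, current: dict) -> list[str]:
    """Return a human-readable line for every metric that regressed."""
    failures = []
    for metric, base in baseline.items():
        cur = current.get(metric)
        if cur is None:
            failures.append(f"{metric}: missing from current metrics")
            continue
        delta = (cur - base) / base
        worse = delta > THRESHOLD if metric in HIGHER_IS_WORSE else delta < -THRESHOLD
        if worse:
            failures.append(f"{metric}: {base} -> {cur} ({delta:+.1%})")
    return failures


if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        baseline = json.load(f)
    with open(sys.argv[2]) as f:
        current = json.load(f)
    failures = detect_regressions(baseline, current)
    for line in failures:
        print(f"REGRESSION {line}")
    sys.exit(1 if failures else 0)  # nonzero exit fails the CI step
```

With the example data above, the script flags p95_latency_ms (+17.5%) and exits nonzero, which is enough to fail a CI step. The statistical analysis the suggestion mentions could replace the fixed threshold with a significance test over repeated runs (e.g., scipy.stats.ttest_ind on per-run samples).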

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose with extensive padding. Explains obvious concepts Claude already knows (what CI/CD is, what performance metrics are), repeats information across sections (Overview, How It Works, When to Use all say similar things), and includes generic filler content throughout. | 1 / 3 |
| Actionability | No concrete code, commands, or executable guidance anywhere. Everything is abstract description ('gather performance metrics', 'apply statistical analysis') without specifying actual tools, scripts, file formats, or commands to run. | 1 / 3 |
| Workflow Clarity | The Instructions section lists steps, but they are vague and lack validation checkpoints. No feedback loops for error recovery, no specific commands, and the Error Handling section just lists things to 'verify' or 'check' without explaining how. | 1 / 3 |
| Progressive Disclosure | Content is organized into sections with headers, but it's a monolithic document with no references to external files. The Prerequisites mention a path but don't link to any detailed documentation. Content that could be split (examples, error handling) is all inline. | 2 / 3 |

Total: 5 / 12 passed

Activation

Score: 90%

This is a well-structured description that clearly communicates when and why to use the skill, with explicit trigger phrases that aid skill selection. The main weakness is that the capabilities could be more specific about what concrete actions the skill performs beyond 'detect' and 'compare'. The description uses proper third-person voice throughout.

Suggestions

- Add more specific concrete actions such as 'generate regression reports', 'set performance thresholds', or 'track latency/throughput/memory metrics' to improve specificity.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (CI/CD pipelines, performance) and some actions (detect regressions, compare metrics, analyze trends), but lacks specific concrete actions like 'generate regression reports', 'set threshold alerts', or 'track specific metrics like latency/throughput'. | 2 / 3 |
| Completeness | Clearly answers both what (detect performance regressions by comparing metrics against baselines) and when (validating builds, analyzing performance trends), with explicit trigger phrases provided. | 3 / 3 |
| Trigger Term Quality | Includes good natural trigger phrases users would say: 'detect performance regression', 'compare performance metrics', 'analyze performance degradation', plus contextual terms like 'CI/CD pipelines', 'validating builds', and 'performance trends'. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on performance regression detection in a CI/CD context, with distinct triggers like 'regression', 'baselines', and 'degradation'; unlikely to conflict with general monitoring or testing skills. | 3 / 3 |

Total: 11 / 12 passed

Status: Reviewed
