`tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill detecting-performance-regressions`

Automatically detect performance regressions in CI/CD pipelines by comparing metrics against baselines. Use when validating builds or analyzing performance trends. Trigger with phrases like "detect performance regression", "compare performance metrics", or "analyze performance degradation".
Validation
81%

| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 13 / 16 Passed |
Implementation
7%

This skill content is highly verbose and abstract, explaining concepts Claude already understands while failing to provide any concrete, executable guidance. It describes what the skill does conceptually but never shows how to actually detect regressions: no code, no specific tools, no file formats, no actual commands. The content reads like marketing copy rather than actionable instructions.
Suggestions
- Replace abstract descriptions with concrete code examples showing how to load baseline data, compare metrics, and detect regressions (e.g., Python scripts with pandas/scipy for statistical analysis); see the sketch after this list
- Specify the exact file formats for baselines and metrics (JSON schema, CSV structure) and provide example data
- Add explicit validation steps with actual commands: 'Run `python validate_baseline.py` - if it returns errors, fix X before proceeding'
- Remove redundant sections (Overview, How It Works, When to Use all repeat similar information) and consolidate into a lean quick-start with executable examples
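The first two suggestions could be grounded with something like the following sketch: a baseline file and a current-run file, each holding a flat metric-to-value JSON map, compared against a relative threshold. The file names, JSON layout, and 10% threshold are illustrative assumptions, not anything the reviewed skill specifies.

```python
# Minimal sketch of the regression check the suggestions describe.
# Assumptions: baseline.json / current.json contain flat {"metric": value}
# maps, and a metric that grows by more than 10% counts as a regression.
import json
import sys

REGRESSION_THRESHOLD = 0.10  # flag metrics that degrade by more than 10%


def load_metrics(path: str) -> dict:
    """Load a flat mapping such as {"p95_latency_ms": 120.0, "rss_mb": 512.0}."""
    with open(path) as f:
        return json.load(f)


def detect_regressions(baseline: dict, current: dict) -> list:
    """Report metrics that got worse, assuming lower values are better
    (latency, memory). Invert the comparison for throughput-style metrics."""
    findings = []
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # skip metrics with no counterpart or a zero baseline
        delta = (current[name] - base_value) / base_value
        if delta > REGRESSION_THRESHOLD:
            findings.append(
                f"{name}: {base_value:.2f} -> {current[name]:.2f} (+{delta:.1%})"
            )
    return findings


if __name__ == "__main__":
    regressions = detect_regressions(
        load_metrics("baseline.json"), load_metrics("current.json")
    )
    for finding in regressions:
        print(f"REGRESSION {finding}")
    sys.exit(1 if regressions else 0)  # non-zero exit fails the CI step
```

A ratio-based threshold keeps the check dependency-free; swapping in a scipy.stats comparison (for example, a t-test over repeated benchmark runs) would be the natural next step the first suggestion points at.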
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose with extensive padding. Explains obvious concepts Claude already knows (what CI/CD is, what performance metrics are), repeats information across sections (Overview, How It Works, When to Use all say similar things), and includes generic filler content throughout. | 1 / 3 |
| Actionability | No concrete code, commands, or executable guidance anywhere. Everything is abstract description ('gather performance metrics', 'apply statistical analysis') without specifying actual tools, scripts, file formats, or commands to run. | 1 / 3 |
| Workflow Clarity | Instructions section lists steps, but they are vague and lack validation checkpoints. No feedback loops for error recovery, no specific commands, and the Error Handling section just lists things to 'verify' or 'check' without explaining how. | 1 / 3 |
| Progressive Disclosure | Content is organized into sections with headers, but it's a monolithic document with no references to external files. The Prerequisites mention a path but don't link to any detailed documentation. Content that could be split (examples, error handling) is all inline. | 2 / 3 |
| Total | | 5 / 12 |
Activation
90%

This is a well-structured description that clearly communicates when and why to use the skill, with explicit trigger phrases that aid skill selection. The main weakness is that the capabilities could be more specific about what concrete actions the skill performs beyond 'detect' and 'compare'. The description uses proper third-person voice throughout.
Suggestions
- Add more specific concrete actions such as 'generate regression reports', 'set performance thresholds', or 'track latency/throughput/memory metrics' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (CI/CD pipelines, performance) and some actions (detect regressions, compare metrics, analyze trends), but lacks specific concrete actions like 'generate regression reports', 'set threshold alerts', or 'track specific metrics like latency/throughput'. | 2 / 3 |
| Completeness | Clearly answers both what (detect performance regressions by comparing metrics against baselines) and when (validating builds, analyzing performance trends), with explicit trigger phrases provided. | 3 / 3 |
| Trigger Term Quality | Includes good natural trigger phrases users would say: 'detect performance regression', 'compare performance metrics', 'analyze performance degradation', plus contextual terms like 'CI/CD pipelines', 'validating builds', and 'performance trends'. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on performance regression detection in a CI/CD context, with distinct triggers like 'regression', 'baselines', and 'degradation'; unlikely to conflict with general monitoring or testing skills. | 3 / 3 |
| Total | | 11 / 12 |