This skill enables Claude to track and run regression tests, ensuring that new changes don't break existing functionality. It is triggered when the user asks to "track regression", "run regression tests", or uses the shortcut "reg". The skill helps maintain code stability by identifying critical tests, automating their execution, and analyzing the impact of changes. It also provides insights into test history and identifies flaky tests. The skill uses the `regression-test-tracker` plugin.
Install with the Tessl CLI:

`npx tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill tracking-regression-tests87`
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the Tessl CLI to improve its score:

`npx tessl skill review --optimize ./path/to/skill`

Evaluation — 92%

↑ 1.39x agent success when using this skill
Validation for skill structure

Plugin usage and mark flag

| Criterion | Without context | With context |
| --- | --- | --- |
| Plugin reference | 0% | 100% |
| Mark flag syntax | 0% | 100% |
| Confirmation language | 90% | 60% |
| Critical test selection | 100% | 100% |
| Run script uses plugin | 0% | 100% |
| Runbook mark workflow | 20% | 100% |
| Runbook run workflow | 100% | 100% |
| Deployment frequency guidance | 100% | 100% |

Without context: $0.4298 · 2m 54s · 22 turns · 23 in / 6,590 out tokens
With context: $0.5349 · 3m 12s · 28 turns · 282 in / 7,214 out tokens
Flaky test detection and failure analysis

| Criterion | Without context | With context |
| --- | --- | --- |
| Failures highlighted | 100% | 100% |
| Flaky test identified | 100% | 100% |
| Consistent failure identified | 100% | 100% |
| Root cause for reorder trigger | 100% | 100% |
| Root cause for concurrent update | 100% | 100% |
| Root cause for low stock alert | 100% | 100% |
| Prioritized action list | 80% | 100% |

Without context: $0.3098 · 1m 28s · 14 turns · 15 in / 5,123 out tokens
With context: $0.3736 · 2m 52s · 17 turns · 17 in / 5,539 out tokens
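The flaky/consistent/stable distinction these checks score can be sketched as a small classifier over a test's recent pass/fail history. The data shape and test names below are hypothetical illustrations, not the plugin's actual storage format:

```python
from collections import Counter

def classify(history):
    """Label a test from its recent run history (True = pass)."""
    counts = Counter(history)
    if counts[False] == 0:
        return "stable"              # never failed in the window
    if counts[True] == 0:
        return "consistent-failure"  # never passed in the window
    return "flaky"                   # mixed passes and failures

# Hypothetical run histories, oldest run first.
runs = {
    "test_reorder_trigger": [True, False, True, True, False],
    "test_low_stock_alert": [False, False, False, False, False],
    "test_checkout_total": [True, True, True, True, True],
}
for name, history in runs.items():
    print(f"{name}: {classify(history)}")
```

A classifier like this only needs per-test pass/fail records; the window size trades sensitivity to recent regressions against noise from one-off failures.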
CI/CD integration and deployment frequency

| Criterion | Without context | With context |
| --- | --- | --- |
| Plugin in CI script | 0% | 0% |
| Pre-deployment gate | 100% | 100% |
| Deployment blocking | 100% | 100% |
| Critical test selection | 100% | 100% |
| Mark flag documented | 0% | 33% |
| Run frequency guidance | 100% | 100% |
| Flaky test mention | 100% | 100% |
| Results interpretation | 100% | 100% |

Without context: $0.2896 · 1m 12s · 18 turns · 19 in / 4,212 out tokens
With context: $0.4280 · 2m 32s · 23 turns · 23 in / 5,796 out tokens
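The pre-deployment gate and deployment blocking scored above can be sketched generically: run the regression suite, and refuse to deploy on a nonzero exit code. The runner command here is a placeholder, not the plugin's actual interface:

```python
import subprocess
import sys

def regression_gate(command):
    """Run the regression suite; return True only if it passes."""
    result = subprocess.run(command)
    if result.returncode != 0:
        print("Regression failures detected; blocking deployment.")
        return False
    print("Regression suite passed; safe to deploy.")
    return True

# Demonstration with a trivially passing command; in CI this would be
# the project's real regression runner (a hypothetical substitution).
ok = regression_gate([sys.executable, "-c", "pass"])
```

In a CI pipeline the gate's exit status would be what actually blocks the deploy step, so the script should terminate with a nonzero code when `regression_gate` returns `False`.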
Pre-launch regression baseline setup

| Criterion | Without context | With context |
| --- | --- | --- |
| Plugin invoked for marking | 0% | 100% |
| Mark flag used | 0% | 100% |
| Confirmation per test | 0% | 0% |
| Critical path tests selected | 100% | 100% |
| Change-risk tests included | 100% | 100% |
| Plugin used for run | 0% | 100% |
| Pre-deployment frequency | 100% | 100% |
| Results include failures | 100% | 100% |
| Flaky test awareness | 100% | 100% |
| Runbook completeness | 100% | 100% |

Without context: $0.3901 · 3m 16s · 14 turns · 15 in / 7,774 out tokens
With context: $0.5657 · 3m 50s · 25 turns · 279 in / 9,306 out tokens
Change-driven regression run and analysis

| Criterion | Without context | With context |
| --- | --- | --- |
| Change-affected tests identified | 100% | 100% |
| Plugin used for marking | 0% | 100% |
| Mark flag syntax correct | 0% | 100% |
| Plugin used for run | 0% | 100% |
| Failures highlighted in report | 100% | 100% |
| Root cause per failure | 100% | 100% |
| Flaky tests flagged | 100% | 100% |
| Confirmation of addition | 37% | 62% |
| Prioritized action list | 100% | 100% |
| Critical tests not omitted | 100% | 100% |

Without context: $0.4312 · 2m 54s · 19 turns · 20 in / 6,692 out tokens
With context: $0.4678 · 3m 28s · 21 turns · 102 in / 6,816 out tokens
Flaky test triage and suite refinement

| Criterion | Without context | With context |
| --- | --- | --- |
| Flaky tests identified | 100% | 100% |
| Consistent failures identified | 100% | 100% |
| Stable tests identified | 100% | 100% |
| Root cause per failing test | 100% | 100% |
| Flaky root cause or hypothesis | 100% | 100% |
| Plugin used for refinement | 0% | 100% |
| Mark flag for additions | 0% | 100% |
| Critical tests retained | 100% | 100% |
| Prioritized recommendations | 100% | 100% |
| Confirmation output | 100% | 50% |

Without context: $0.3939 · 2m 54s · 16 turns · 17 in / 7,153 out tokens
With context: $0.3922 · 1m 34s · 20 turns · 278 in / 5,491 out tokens
Test history insights and trend analysis

| Criterion | Without context | With context |
| --- | --- | --- |
| Flaky tests identified | 100% | 100% |
| Consistently failing identified | 100% | 100% |
| Stable tests identified | 100% | 100% |
| Trend patterns reported | 100% | 100% |
| Root cause for flaky tests | 100% | 100% |
| Root cause for consistent failure | 100% | 100% |
| Plugin used for suite update | 0% | 100% |
| Mark flag for additions | 0% | 100% |
| Prioritized action recommendations | 100% | 100% |
| Suite update confirmation | 100% | 100% |

Without context: $0.4440 · 3m 54s · 17 turns · 18 in / 8,173 out tokens
With context: $0.4759 · 3m 29s · 23 turns · 21 in / 7,176 out tokens
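The trend-pattern reporting scored above can be illustrated by bucketing a test's pass/fail history into fixed-size windows and computing a pass rate per window; a falling sequence of rates signals a degrading test. The history data here is a made-up example:

```python
def pass_rate_trend(history, window=5):
    """Pass rate per fixed-size window of runs, oldest first."""
    rates = []
    for i in range(0, len(history) - window + 1, window):
        chunk = history[i:i + window]
        rates.append(sum(chunk) / window)
    return rates

# A hypothetical test that degraded over time: early windows pass,
# the middle window is flaky, and recent runs fail outright.
history = [True] * 5 + [True, True, False, True, True] + [False] * 5
print(pass_rate_trend(history))  # [1.0, 0.8, 0.0]
```

Comparing the first and last window rates gives a simple degradation signal that can feed the prioritized action recommendations.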
Emergency hotfix regression verification

| Criterion | Without context | With context |
| --- | --- | --- |
| Plugin used for marking | 0% | 50% |
| Mark flag syntax | 0% | 80% |
| Affected tests identified | 100% | 100% |
| Plugin used for run | 0% | 60% |
| Pre-deployment urgency | 100% | 100% |
| Confirmation per test added | 100% | 100% |
| Failures highlighted | 75% | 62% |
| Root cause per failure | 100% | 90% |
| Flaky test handling | 0% | 37% |
| Critical path tests selected | 100% | 100% |

Without context: $0.4057 · 1m 38s · 20 turns · 21 in / 6,130 out tokens
With context: $0.4300 · 1m 36s · 23 turns · 103 in / 5,637 out tokens
Critical test selection for new module regression suite

| Criterion | Without context | With context |
| --- | --- | --- |
| Plugin used for marking | 0% | 100% |
| Mark flag per test | 0% | 100% |
| Critical functionality tests selected | 100% | 100% |
| Change-likely tests included | 100% | 100% |
| Confirmation per test | 100% | 100% |
| Selection rationale documented | 100% | 100% |
| Low-value tests excluded | 100% | 100% |
| Run frequency guidance | 25% | 100% |
| Plugin run step documented | 0% | 100% |
| Flaky test awareness | 100% | 100% |

Without context: $0.4046 · 3m 13s · 21 turns · 21 in / 6,562 out tokens
With context: $0.3282 · 1m 23s · 19 turns · 1,934 in / 4,298 out tokens
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.