Applies the SZZ algorithm to VCS history to identify which commits introduced bugs by correlating bug-fix commits with earlier changes. Use when mining a repository for bug-introducing commits, when building a defect-prediction dataset, or when the user asks which commit introduced a given fixed bug.
Install with Tessl CLI
npx tessl i github:santosomar/general-secure-coding-agent-skills --skill szz-bug-identifier
SZZ (Śliwerski, Zimmermann, Zeller, 2005) answers: given a bug-fix commit, which earlier commit introduced the bug? It works by blaming the lines the fix touched.
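1. Identify bug-fix commits, using an issue-tracker reference (`Fixes #1234`, `BUG-567`) or a message keyword (`fix`, `bug`, `patch`).
2. Diff each fix commit against its parent to find the lines it modified.
3. Run `git blame <fix>^ -- <file>` on each modified line to find the commit that last touched it before the fix.
4. Treat each blamed commit as a bug-introducing candidate.

That's the whole algorithm. The rest is noise filtering.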
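The diff-then-blame part of the algorithm can be sketched in Python. This is a minimal illustration, not a reference implementation: `deleted_lines` and `blame_command` are hypothetical helper names, the input is assumed to be standard unified diff text (for example from `git diff <fix>^ <fix>`), and renames and binary files are ignored.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,4 +10,5 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def deleted_lines(diff_text):
    """Map each pre-fix file path to the old-side line numbers the diff removes."""
    result = {}
    path = None
    old_no = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("-"):
                if path is not None:
                    result.setdefault(path, []).append(old_no)
                old_no += 1
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1  # added lines have no blame target
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file"
            else:  # context line, present on both sides
                old_no += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("--- "):
            target = line[4:].split("\t")[0]
            if target == "/dev/null":
                path = None  # file was created by the fix
            else:
                path = target[2:] if target.startswith("a/") else target
        else:
            m = HUNK_RE.match(line)
            if m:
                old_no = int(m.group(1))
                old_left = int(m.group(2) or 1)
                new_left = int(m.group(4) or 1)
    return result


def blame_command(fix_sha, path, lineno):
    """Build the git blame invocation for one pre-fix line (not executed here)."""
    return ["git", "blame", "-w", "-M", "-C", "--porcelain",
            "-L", f"{lineno},{lineno}", f"{fix_sha}^", "--", path]
```

Each `(path, line)` pair from `deleted_lines` would be passed to `blame_command`, with the resulting argument list run through something like `subprocess.run`. Only removed lines are collected, since lines added by the fix did not exist in the parent commit.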
| Noise source | Why it's wrong | Filter |
|---|---|---|
| Whitespace / formatting changes | Blame hits a Prettier run, not the real introducer | `git blame -w`; ignore commits that only touch formatting |
| Comment-only changes | The fix also edited a comment; that line is not the bug | Strip comment lines before blaming |
| Large refactor commits | Every line blames to the Great Refactor of 2019 | `git blame --ignore-rev <rev>`, or `--ignore-revs-file` with a curated ignore-list |
| The line was added by the fix | No blame target; added lines didn't exist before the fix | Only blame deleted/modified lines, not added ones |
| Bug predates the repo | Blame hits the initial import commit | Flag as unattributable |
| Moved file or code | Blame stops at the commit that moved or copied the lines | `git blame -M -C` to follow lines moved or copied within and across files |
| Blamed commit is newer than bug report | The bug existed before that commit, so the blame is wrong | Discard candidates with commit date > bug-report date |
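A few of these filters can be applied after blaming rather than through git flags. The sketch below is an assumption-laden illustration: `is_blameable` and `filter_candidates` are hypothetical names, the comment prefixes cover Java-style comments only, and dropping an ignored commit after the fact is cruder than `--ignore-rev`, which makes blame look past the commit to the previous one.

```python
from datetime import date

# Assumption: Java-style comments only; adjust per language.
JAVA_COMMENT_PREFIXES = ("//", "/*", "*")


def is_blameable(line_text, comment_prefixes=JAVA_COMMENT_PREFIXES):
    """True if a removed line is worth blaming: not blank and not comment-only."""
    stripped = line_text.strip()
    return bool(stripped) and not stripped.startswith(comment_prefixes)


def filter_candidates(candidates, bug_report_date, ignore_revs=(), initial_commit=None):
    """Split (sha, commit_date) pairs into kept candidates and unattributable ones."""
    ignore = set(ignore_revs)
    kept, unattributable = [], []
    for sha, commit_date in candidates:
        if sha in ignore:
            continue  # curated refactor / formatting commit
        if commit_date > bug_report_date:
            continue  # newer than the bug report, so it cannot be the introducer
        if sha == initial_commit:
            unattributable.append(sha)  # bug predates the repo: flag, don't attribute
            continue
        kept.append(sha)
    return kept, unattributable
```

Returning the unattributable commits separately keeps the "flag" behaviour from the table visible instead of silently discarding those results.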
Step 1: The fix commit is `c4a9f1b`, "Fix: null check in getUserEmail (closes #892)". Its diff:

     public String getUserEmail(long id) {
         User u = repo.find(id);
    -    return u.getEmail();
    +    if (u == null) return null;
    +    return u.getEmail();
     }

Step 2: The modified line is `return u.getEmail();` (the old version).

Step 3: `git blame c4a9f1b^ -- UserService.java` at that line → `a17d3e0`, "Add UserService" (Jane, 2021-03-04).

Step 4: Candidate = `a17d3e0`.

Filters:

- Date: `a17d3e0` is from 2021. ✓

Verdict: `a17d3e0` introduced the bug. The null check was never there.
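The first step of this example, recognising the commit as a fix, can be sketched as a message classifier. This is a rough heuristic under stated assumptions: `fix_signal` is a hypothetical name, the closing verbs are the common `fixes` / `closes` / `resolves` family, and tracker keys are matched only for prefixes you pass in (here `BUG`, taken from the `BUG-567` example).

```python
import re

# "Fixes #1234", "closes #892", "resolved #7" and similar.
CLOSES_RE = re.compile(
    r"\b(?:fix(?:e[sd])?|close[sd]?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)
# Bare keywords: the weaker signal.
KEYWORD_RE = re.compile(r"\b(?:fix(?:e[sd])?|bug|patch)\b", re.IGNORECASE)


def fix_signal(message, tracker_prefixes=("BUG",)):
    """Return ("issue", ref), ("keyword", word), or (None, None)."""
    m = CLOSES_RE.search(message)
    if m:
        return ("issue", "#" + m.group(1))
    for prefix in tracker_prefixes:
        m = re.search(r"\b" + re.escape(prefix) + r"-\d+\b", message)
        if m:
            return ("issue", m.group(0))
    m = KEYWORD_RE.search(message)
    if m:
        return ("keyword", m.group(0).lower())
    return (None, None)


print(fix_signal("Fix: null check in getUserEmail (closes #892)"))
```

Keeping the two signal types distinct lets a mining script rank issue-linked fixes above keyword-only matches when assigning confidence.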
- … `semantic-szz-analyzer` to distinguish.
- The fix is a `git revert`: the reverted commit is the bug-introducer by definition. Shortcut: check whether the fix is a revert before running SZZ.
- Cherry-picked commits: look for `(cherry picked from commit …)` trailers and follow the chain.
- Do not run SZZ without a `.git-blame-ignore-revs` file. Without it, every result blames the last formatting pass.
- Do not use message keywords (`fix`, `bug`) as your only fix-identification signal. The false-positive rate is brutal. Prefer issue-tracker links.
- Do not use SZZ to track down a single regression; that is what `regression-root-cause-analyzer` is for (bisect is more precise). SZZ is for batch mining.

Output format:

    fix: <sha> — <subject>
    candidates:
      <sha> — <subject> (<date>, <author>)
        blamed from: <file>:<line>
        filters passed: whitespace ✓ comment ✓ date ✓
        confidence: <high|medium|low>
      ...
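The revert shortcut and the cherry-pick trailer check described earlier can be sketched as follows. The sketch assumes the default messages git writes (`This reverts commit <sha>.` from `git revert`, and `(cherry picked from commit <sha>)` from `git cherry-pick -x`); hand-edited messages will not match. The function names and the `message_of` mapping from sha to commit message are hypothetical.

```python
import re

REVERT_RE = re.compile(r"^This reverts commit ([0-9a-f]{7,40})", re.MULTILINE)
CHERRY_RE = re.compile(
    r"^\(cherry picked from commit ([0-9a-f]{7,40})\)", re.MULTILINE
)


def reverted_sha(message):
    """Sha of the commit a revert undoes, or None if this is not a default revert."""
    m = REVERT_RE.search(message)
    return m.group(1) if m else None


def cherry_pick_source(message):
    """Sha named in a cherry-pick trailer, or None if there is no trailer."""
    m = CHERRY_RE.search(message)
    return m.group(1) if m else None


def original_commit(sha, message_of):
    """Follow cherry-pick trailers back to the first commit in the chain."""
    seen = set()
    while sha not in seen:  # guard against a malformed cyclic chain
        seen.add(sha)
        source = cherry_pick_source(message_of.get(sha, ""))
        if source is None:
            break
        sha = source
    return sha
```

If `reverted_sha` returns a value for the fix commit, that sha is the candidate and blaming can be skipped. `original_commit` would typically be applied to each blamed candidate so that a backported copy is attributed to its source commit.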