sentry-backend-bugs

Review Sentry Python and Django changes for bug patterns drawn from real production issues. Use when reviewing a backend diff or PR, checking Warden findings, auditing the current branch, reviewing production-error patterns, or looking for common regressions in `src/` and `tests/`.

87

0.97x
Quality: 83% (Does it follow best practices?)

Impact: 94%, 0.97x (Average score across 3 eval scenarios)

Security by Snyk: Passed (No known issues)

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description with excellent completeness and distinctiveness, clearly scoped to Sentry Python/Django code review with explicit trigger scenarios. Its main weakness is that the 'what' portion could be more specific about the concrete bug patterns or actions performed, rather than staying at the level of 'review for bug patterns'. The trigger terms are well-chosen and cover natural user language.

Suggestions

Add specific examples of the bug patterns detected (e.g., 'Detects N+1 queries, missing error handlers, unsafe migrations, race conditions') to improve specificity from general 'bug patterns' to concrete actions.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | It names the domain (Sentry Python/Django) and the general action (review for bug patterns from production issues), but doesn't list specific concrete actions like 'detect N+1 queries, flag missing error handling, check for race conditions'. The actions remain at a high level ('review', 'check', 'audit'). | 2 / 3 |
| Completeness | Clearly answers both 'what' (review Sentry Python and Django changes for bug patterns from real production issues) and 'when' (explicit 'Use when' clause listing five specific trigger scenarios: reviewing diffs/PRs, checking Warden findings, auditing branches, reviewing production-error patterns, looking for regressions). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'diff', 'PR', 'Warden findings', 'production-error patterns', 'regressions', 'src/', 'tests/', 'backend', 'Sentry', 'Django'. These cover multiple natural ways a user might phrase their request. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to the specific niche of Sentry Python/Django bug pattern review, references to Warden findings, and specific paths like 'src/' and 'tests/'. Unlikely to conflict with generic code review or other language-specific skills. | 3 / 3 |

Total: 11 / 12 (Passed)

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, well-structured skill that encodes specific production knowledge into an actionable review workflow. Its greatest strength is the concrete, pattern-based checks with real code examples for both red flags and safe patterns, plus explicit guidance on what NOT to flag (reducing false positives). It has two main weaknesses. First, it is somewhat long for a SKILL.md; the detailed checks could live in the referenced files. Second, the referenced bundle files were not provided, so the progressive disclosure structure cannot be verified end-to-end.

Suggestions

Consider moving the detailed check descriptions (red flags, safe patterns) into the referenced files and keeping only a summary table with one-line descriptions in SKILL.md, since the references are already defined in Step 1.

Trim the preamble statistics ('638 real production issues', '27 million error events') — Claude doesn't need persuasion to follow instructions, and this adds tokens without changing behavior.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is fairly long (~300 lines) but most content earns its place: the pattern checks encode specific, non-obvious production knowledge Claude wouldn't have. However, the preamble about '638 real production issues' and event counts, while lending credibility, is context Claude doesn't need to act on. Some checks repeat similar safe patterns (e.g., try/except DoesNotExist appears in multiple checks). The 'Not a bug' callouts are valuable and prevent false positives, which justifies their inclusion. | 2 / 3 |
| Actionability | Each check provides specific red flags with concrete code patterns (e.g., `Model.objects.get(id=some_id)` without try/except), concrete safe patterns with actual code snippets, and clear guidance on what to report vs. skip. The confidence table gives precise criteria for action. Fix suggestions are required to include actual code. The instruction to trace data flow using Read and Grep is concrete and executable. | 3 / 3 |
| Workflow Clarity | The three-step workflow (Classify → Check Patterns → Report) is clearly sequenced with explicit decision points. Step 1 has a classification table mapping code types to references. Step 2 is ordered by impact. The confidence table provides clear validation criteria (HIGH/MEDIUM/LOW with specific actions). The instruction to stop and report zero findings when nothing matches is an important validation checkpoint that prevents false positives. | 3 / 3 |
| Progressive Disclosure | The skill references 8 external reference files (e.g., `references/missing-records.md`, `references/null-and-type-errors.md`), which is good progressive disclosure design. However, no bundle files were provided, so we cannot verify these references exist or are well-structured. The main SKILL.md itself is quite long; some of the detailed check patterns could potentially live in the referenced files rather than being duplicated in the main body, though having them inline does make the skill self-contained for quick scanning. | 2 / 3 |

Total: 10 / 12 (Passed)
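
For readers unfamiliar with the red flag quoted under Actionability (`Model.objects.get(id=some_id)` without try/except), the following is a minimal sketch of the contrast between the flagged pattern and a safe one. It is not taken from the skill itself, whose snippets are not reproduced on this page. `DoesNotExist`, `FakeManager`, `projects`, and both helper functions are hypothetical stand-ins written in plain Python so the example runs without Django; in real Django code the exception would be the model's own `DoesNotExist`.

```python
class DoesNotExist(Exception):
    """Hypothetical stand-in for a Django model's DoesNotExist exception."""


class FakeManager:
    """Hypothetical in-memory substitute for Model.objects (illustration only)."""

    def __init__(self, rows):
        self._rows = {row["id"]: row for row in rows}

    def get(self, id):
        # Like a Django manager's get(), raise when no row matches.
        try:
            return self._rows[id]
        except KeyError:
            raise DoesNotExist(f"no row with id={id}") from None


projects = FakeManager([{"id": 1, "name": "backend"}])


def project_name_unsafe(project_id):
    # Red flag: a missing row surfaces as an unhandled exception.
    return projects.get(id=project_id)["name"]


def project_name_safe(project_id):
    # Safe pattern: handle the missing-record case explicitly.
    try:
        return projects.get(id=project_id)["name"]
    except DoesNotExist:
        return None
```

Returning `None` is only one way to handle the miss; whether a caller should get `None`, a 404, or a logged skip depends on the code path under review.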

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: getsentry/sentry (Reviewed)
