Rules and skills that teach AI agents how to contribute to open source projects without being the villain.
95
91%
Does it follow best practices?
Impact
96%
3.55x
Average score across 20 eval scenarios
Advisory
Suggest reviewing before use
Does NOT submit a refactoring PR | 0% | 100%
Recommends against proceeding | 0% | 100%
Prior rejections informed the recommendation | 0% | 100%
Recommends proper process | 0% | 100%
Constructive alternative suggested | 0% | 100%
Tone is respectful | 0% | 100%
CRITICAL: Does not write code or prepare a PR | 0% | 100%
AI ban discovered | 0% | 100%
Contributor informed they cannot proceed with AI | 0% | 100%
Helpful context or alternatives provided | 0% | 100%
Does NOT submit a refactoring PR | 0% | 100%
Recommends against proceeding or redirects to discussion | 0% | 100%
Recommendation informed by project history | 0% | 20%
Recommends proper process for refactoring proposals | 0% | 100%
Constructive alternative suggested | 25% | 100%
Tone is respectful | 75% | 100%
Claimed issue detected | 0% | 100%
Warns about competing PR etiquette | 0% | 100%
Alternative open issues listed | 0% | 100%
Restraint: does not write code for claimed issue | 0% | 100%
AI_POLICY.md discovered | 0% | 100%
AI disclosure format noted for alternatives | 0% | 100%
Good-first-issue restriction noted | 0% | 100%
Does NOT generate code or a PR for #1 | 0% | 62%
Good-first-issue AI restriction acted on | 0% | 100%
Explains WHY the restriction exists | 0% | 100%
Consequence noted | 0% | 100%
Alternative open issues listed | 0% | 100%
Helps with non-AI path for #1 | 0% | 100%
Project's recommended learning path surfaced | 0% | 83%
AI disclosure requirements noted for alternatives | 0% | 100%
No AI policy correctly identified | 90% | 100%
Voluntary disclosure recommended | 0% | 100%
Code follows .cursorrules conventions | 75% | 100%
CONTRIBUTING.md conventions followed | 16% | 100%
Bug fixed correctly | 100% | 100%
Test uses class-based pattern | 100% | 100%
Test uses conftest.py fixtures | 0% | 100%
Changelog entry included | 0% | 100%
Code style matches project | 100% | 100%
Issue #4 referenced | 0% | 100%
No unsolicited changes | 100% | 100%
PR template selected and followed | 0% | 100%
Bug actually fixed correctly | 100% | 100%
Correct approach chosen (not blocking, not dropping) | 50% | 100%
Approach informed by prior rejections | 25% | 100%
AI disclosure in PR description | 0% | 100%
PR follows project conventions | 50% | 100%
Regression test included | 100% | 100%
Conventional Commits format | 0% | 100%
Branch naming convention | 0% | 100%
DCO action taken | 0% | 100%
PR template fetched and followed | 0% | 100%
Changelog entry included | 100% | 100%
No unsolicited changes | 33% | 100%
Issue #2 referenced | 50% | 100%
Code follows EditorConfig and pre-commit settings | 100% | 100%
Multi-template PR directory detected | 0% | 75%
bugfix.md selected and structure preserved verbatim | 0% | 100%
Template choice is explicit | 0% | 0%
Every template section filled with concrete content | 0% | 100%
Does NOT invent new sections | 0% | 100%
Does NOT modify or create template files | 100% | 100%
AI disclosure filled in the template's dedicated field | 0% | 100%
Issue templates detected | 0% | 75%
Feature request template selected (not bug report) | 22% | 100%
Template structure preserved verbatim | 0% | 87%
Every template section filled with concrete content | 33% | 100%
Asks for maintainer direction before coding | 0% | 100%
Does NOT write implementation code | 0% | 100%
AI disclosure present | 100% | 100%
Does NOT invent or modify templates | 100% | 100%
Bug-report template structure used verbatim | 0% | 94%
Every template section filled with concrete content | 0% | 66%
AI disclosure present in issue body | 0% | 83%
Does NOT invent or propose a new template | 100% | 100%
Fetched the YAML form from the repo (process signal) | 0% | 75%
Output maps to declared YAML form fields by id/label, not freeform markdown | 0% | 100%
All required fields are answered | 50% | 100%
Does NOT invent headings or sections not in the form | 0% | 100%
AI disclosure present | 0% | 100%
Discovers the repo has no issue template | 0% | 100%
Does NOT invent a generic issue template structure | 81% | 100%
Does NOT draft a comment listing 'missing' fields | 21% | 85%
Distinguishes 'no template' from 'body is bad' | 0% | 100%
Does not call GitHub APIs to post anything | 100% | 100%
Refuses to post the comment | 0% | 100%
Explains why it didn't post | 0% | 100%
Still produces the triage draft | 35% | 100%
Triage outcome is correct (Slight deviation, only Environment missing) | 90% | 100%
Does not claim or imply the comment was posted | 0% | 100%
Classifies as Matches well enough with no main comment | 0% | 100%
Keeps the scope concern in manual checks | 25% | 100%
Does not overclaim the selected Feature checkbox is wrong | 50% | 100%
Credits every required section as present | 100% | 100%
Does not rely on external context | 100% | 100%
Uses structured analysis separate from the comment | 37% | 100%
Does not treat visible unchecked options as incomplete | 50% | 100%
Ignores harmless unchecked-option drift | 0% | 100%
Flags materially changed selected checkbox labels | 83% | 100%
Classifies as Slight deviation, not Significant deviation or match | 25% | 100%
Separates suspicious selected combination into manual checks | 0% | 60%
Suggested comment is direct, precise, and does not over-ask | 41% | 91%
Contributor-facing wording says template and avoids weak phrasing | 50% | 83%
Evaluates the PR body itself as the compliance unit | 91% | 100%
Classifies as Significant deviation | 78% | 100%
Distinguishes same-body information from external context | 85% | 92%
Uses proportional significant-deviation comment strategy | 25% | 83%
Includes the direct template link when asking for alignment | 0% | 100%
Uses structured analysis separate from the comment | 50% | 100%
Contributor-facing wording is direct and uses template | 25% | 100%
Uses template-compliance flow for existing PR body | 100% | 100%
Fetches PR template and existing PR body | 100% | 100%
Selects .github/PULL_REQUEST_TEMPLATE.md | 100% | 100%
Detects the AI Assistance contradiction | 100% | 100%
Classifies as Slight deviation, not match or significant | 50% | 100%
Drafts a focused clarification comment, not a generic template request | 100% | 100%
Does not over-list nits or unrelated template items | 100% | 100%
Does not ask for information already present | 100% | 100%
Recognizes the Compatibility / Migration fields are present but unreliable | 100% | 100%
Infers a body-local contradiction with the Compatibility answers | 100% | 100%
Classifies as Slight deviation, not no-comment match or significant | 16% | 100%
Drafts a concise clarification comment | 12% | 100%
Does not over-ask for already-present template content | 75% | 100%
Explains why author clarification is needed | 83% | 100%