Evaluates SKILL.md submissions for the AI Engineer London 2026 Skills Contest across 11 dimensions (8 from the official Tessl rubric plus 3 bonus). Use when the user says 'judge my AIE26 contest skill', 'score this SKILL.md for the contest', 'review my skill submission', or 'how would this score on the leaderboard'. Accepts GitHub repo URLs, file paths, or raw pastes.
**Score:** 82
**Does it follow best practices?** 94%
**Impact:** 65%
**Average score across 5 eval scenarios:** 1.80x
**Risk:** Risky — do not use without reviewing.
A developer is submitting their first entry to the AI Engineer London 2026 Skills Contest. Before they get a score, they want to understand what a strong evaluation looks like — they've seen some contest results online but aren't sure what separates a 70 from a 100. They want to see an example of a top-scoring evaluation, and then get a full evaluation of their own skill.
Their skill is a PR reviewer — a tool that reviews GitHub pull requests for code quality and security issues. They've been using it internally and think it's pretty solid, but they're not sure how it will hold up against the official rubric.
Write your full response to response.md in your working directory. Include:
The following file is provided as input. Extract it before beginning.
You review GitHub pull requests systematically across three dimensions: code quality, security, and completeness.
Code review only. Do not merge, approve, or comment via the GitHub API. Do not explain git or GitHub concepts.
Accept input as:
Extract:
Display: "Reviewing PR — [N] files changed, +[X]/-[Y] lines, language: [lang]."
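The display line above can be sketched as a small formatter. This is a minimal illustration, not the skill's actual implementation; the function name and arguments are assumptions:

```python
def format_status(files_changed, added, removed, language):
    """Build the one-line review header in the format the skill specifies."""
    return (f"Reviewing PR — {files_changed} files changed, "
            f"+{added}/-{removed} lines, language: {language}.")

# Example: a 4-file Python PR with 120 additions and 35 deletions.
print(format_status(4, 120, 35, "Python"))
```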
Check for:
Flag each finding as CRITICAL, HIGH, or MEDIUM.
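One way to keep findings ordered by the three severity levels is a fixed ranking table. This is a hypothetical sketch under the assumption that findings are (severity, message) pairs; the skill itself does not prescribe a data structure:

```python
# Lower rank = more severe; surfaces CRITICAL findings first.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2}

def sort_findings(findings):
    """Sort (severity, message) pairs from most to least severe."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f[0]])

findings = [
    ("MEDIUM", "missing docstring on new public function"),
    ("CRITICAL", "hardcoded API key in config change"),
    ("HIGH", "user input passed to shell command unvalidated"),
]
for severity, message in sort_findings(findings):
    print(f"{severity}: {message}")
```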
Check for:
Check for:
Output:
## PR Review Summary
**Files changed:** N | **Lines:** +X/-Y
### Security (X issues)
[findings by severity]
### Code Quality (X issues)
[findings]
### Completeness (X issues)
[findings]
### Recommendation
APPROVE / REQUEST CHANGES / NEEDS DISCUSSION
**Rationale:** [1-2 sentences]

Repository contents:

- docs
- superpowers
- evals
- scenario-1
- scenario-2
- scenario-3
- scenario-4
- scenario-5
- references
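The output template above can be sketched as a renderer that assembles the summary from grouped findings. This is a minimal sketch, assuming findings arrive as lists of strings keyed by section name; all names and sample data here are illustrative:

```python
def render_summary(files_changed, added, removed, sections,
                   recommendation, rationale):
    """Render the PR Review Summary markdown from per-section findings."""
    lines = [
        "## PR Review Summary",
        f"**Files changed:** {files_changed} | **Lines:** +{added}/-{removed}",
    ]
    for name, findings in sections.items():
        lines.append(f"### {name} ({len(findings)} issues)")
        lines.extend(f"- {finding}" for finding in findings)
    lines.append("### Recommendation")
    lines.append(recommendation)
    lines.append(f"**Rationale:** {rationale}")
    return "\n".join(lines)

report = render_summary(
    3, 80, 12,
    {
        "Security": ["HIGH: SQL query built by string concatenation"],
        "Code Quality": [],
        "Completeness": ["MEDIUM: no tests for the new endpoint"],
    },
    "REQUEST CHANGES",
    "One high-severity security issue must be fixed before merge.",
)
print(report)
```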