Evaluates SKILL.md submissions for the AI Engineer London 2026 Skills Contest across 11 dimensions (8 from the official Tessl rubric plus 3 bonus). Use when the user says 'judge my AIE26 contest skill', 'score this SKILL.md for the contest', 'review my skill submission', or 'how would this score on the leaderboard'. Accepts GitHub repo URLs, file paths, or raw pastes.
82
Does it follow best practices? 94%
Impact: 65% (1.80x). Average score across 5 eval scenarios
Risky: Do not use without reviewing
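The page does not define how the 1.80x figure is derived. As a rough sketch only, assuming it is the ratio of the 65% average score to some baseline average (an assumption, not something stated here), the implied baseline can be worked out as follows. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(score_pct: float, multiplier: float) -> float:
    """Return the baseline percentage implied by a score and a multiplier.

    Assumes multiplier = score / baseline, which is an assumption about
    how the page computes its "x" figure, not a documented definition.
    """
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return score_pct / multiplier


# Figures taken from the summary: 65% average score, 1.80x multiplier.
baseline = implied_baseline(65.0, 1.80)
print(f"Implied baseline: {baseline:.1f}%")
```

Under that assumption the baseline would sit at roughly 36%, so the reported gain would be about 29 percentage points.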
Output format and scoring formula
Receipt confirmation format: 0%, 0%
Phase 1 display line: 0%, 0%
Structure pass message: 0%, 0%
Core score table present: 0%, 100%
Scores as X/3: 0%, 100%
Core score formula applied: 0%, 100%
Bonus score table present: 0%, 100%
Bonus score format: 0%, 100%
Detailed feedback for all 11 dimensions: 0%, 100%
Verdict present: 0%, 100%
Verdict mentions highest-leverage improvement: 33%, 100%
Structural validation blocking
Receipt confirmation present: 0%, 0%
Phase 1 display line: 0%, 0%
Structural Issues header format: 0%, 0%
Trigger language issue identified: 0%, 42%
Fix instruction provided: 0%, 41%
Numbered issue list: 0%, 0%
Resubmit instruction: 100%, 0%
Structure passed message absent: 100%, 100%
No core scoring table: 100%, 0%
No core score percentage: 100%, 0%
Scope enforcement and refusal
Editing request refused: 0%, 100%
No edited skill content: 0%, 100%
Contest logistics refused: 100%, 100%
No contest logistics answered: 100%, 66%
Offers evaluation instead: 0%, 83%
No ranking produced: 100%, 100%
No CLI execution claimed: 100%, 100%
Refusal is clear not evasive: 0%, 80%
Response remains brief: 100%, 0%
No non-AIE26 tangent: 50%, 100%
Edge cases: short skill and batch
Short skill flagged as incomplete: 66%, 100%
Short skill still scored: 100%, 100%
Short skill receipt confirmation: 0%, 0%
Short skill Phase 1 display: 0%, 0%
First batch skill complete scorecard: 100%, 100%
Second batch skill receipt confirmation: 0%, 0%
Second batch skill Phase 1 display: 0%, 0%
Second batch skill complete scorecard: 70%, 100%
Two separate scorecards: 66%, 100%
Scorecards produced sequentially: 100%, 100%
Reference loading and calibration
Calibration example shown: 75%, 100%
Receipt confirmation for submitted skill: 0%, 0%
Phase 1 display line: 0%, 0%
Rubric level language used: 25%, 100%
Evidence quoted in scoring: 100%, 100%
Rubric criteria applied correctly: 16%, 66%
All 11 dimensions scored: 0%, 100%
Detailed feedback for all 11 dimensions: 50%, 100%
Core score formula applied: 0%, 100%
Verdict present and specific: 87%, 100%