# paker-it/aie26-skill-judge

Evaluates SKILL.md submissions for the AI Engineer London 2026 Skills Contest across 11 dimensions (8 official Tessl rubric + 3 bonus). Use when you say 'judge my AIE26 contest skill', 'score this SKILL.md for the contest', 'review my skill submission', or 'how would this score on the leaderboard'. Accepts GitHub repo URLs, file paths, or raw pastes.

Score: 82

- Quality: 94% (Does it follow best practices?)
- Impact: 65% (average score across 5 eval scenarios, 1.80x uplift with context)
- Security (by Snyk): Risky. Do not use without reviewing.


## Evaluation results

### Evaluate an AIE26 Contest Submission
Output format and scoring formula · With context: 78% · Uplift: +74%

| Criteria | Without context | With context |
| --- | --- | --- |
| Receipt confirmation format | 0% | 0% |
| Phase 1 display line | 0% | 0% |
| Structure pass message | 0% | 0% |
| Core score table present | 0% | 100% |
| Scores as X/3 | 0% | 100% |
| Core score formula applied | 0% | 100% |
| Bonus score table present | 0% | 100% |
| Bonus score format | 0% | 100% |
| Detailed feedback for all 11 dimensions | 0% | 100% |
| Verdict present | 0% | 100% |
| Verdict mentions highest-leverage improvement | 33% | 100% |

### Score This Contest Submission
Structural validation blocking · With context: 19% · Uplift: -17%

| Criteria | Without context | With context |
| --- | --- | --- |
| Receipt confirmation present | 0% | 0% |
| Phase 1 display line | 0% | 0% |
| Structural Issues header format | 0% | 0% |
| Trigger language issue identified | 0% | 42% |
| Fix instruction provided | 0% | 41% |
| Numbered issue list | 0% | 0% |
| Resubmit instruction | 100% | 0% |
| Structure passed message absent | 100% | 100% |
| No core scoring table | 100% | 0% |
| No core score percentage | 100% | 0% |

### Help with My Contest Submission
Scope enforcement and refusal · With context: 84% · Uplift: +33%

| Criteria | Without context | With context |
| --- | --- | --- |
| Editing request refused | 0% | 100% |
| No edited skill content | 0% | 100% |
| Contest logistics refused | 100% | 100% |
| No contest logistics answered | 100% | 66% |
| Offers evaluation instead | 0% | 83% |
| No ranking produced | 100% | 100% |
| No CLI execution claimed | 100% | 100% |
| Refusal is clear not evasive | 0% | 80% |
| Response remains brief | 100% | 0% |
| No non-AIE26 tangent | 50% | 100% |

### Evaluate a Batch of Contest Submissions
Edge cases: short skill and batch · With context: 68% · Uplift: +12%

| Criteria | Without context | With context |
| --- | --- | --- |
| Short skill flagged as incomplete | 66% | 100% |
| Short skill still scored | 100% | 100% |
| Short skill receipt confirmation | 0% | 0% |
| Short skill Phase 1 display | 0% | 0% |
| First batch skill complete scorecard | 100% | 100% |
| Second batch skill receipt confirmation | 0% | 0% |
| Second batch skill Phase 1 display | 0% | 0% |
| Second batch skill complete scorecard | 70% | 100% |
| Two separate scorecards | 66% | 100% |
| Scorecards produced sequentially | 100% | 100% |

### Evaluate My Skill — But First, Show Me What Good Looks Like
Reference loading and calibration · With context: 80% · Uplift: +44%

| Criteria | Without context | With context |
| --- | --- | --- |
| Calibration example shown | 75% | 100% |
| Receipt confirmation for submitted skill | 0% | 0% |
| Phase 1 display line | 0% | 0% |
| Rubric level language used | 25% | 100% |
| Evidence quoted in scoring | 100% | 100% |
| Rubric criteria applied correctly | 16% | 66% |
| All 11 dimensions scored | 0% | 100% |
| Detailed feedback for all 11 dimensions | 50% | 100% |
| Core score formula applied | 0% | 100% |
| Verdict present and specific | 87% | 100% |
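The headline Impact figure and the 1.80x uplift appear to aggregate the five per-scenario results above. Here is a minimal sketch of that arithmetic, assuming Impact is the mean with-context scenario score and the multiplier is the ratio of with-context to without-context means; the variable names and the aggregation rule are assumptions on my part, not Tessl's documented formula:

```python
# Hypothetical reconstruction of the headline Impact and uplift figures.
# The aggregation rule is an assumption, not Tessl's documented formula.

# Per-scenario scores with the skill's context loaded (from the tables above)
with_context = [78, 19, 84, 68, 80]

# Per-scenario uplift values; subtracting recovers the without-context scores
uplift = [74, -17, 33, 12, 44]
without_context = [w - u for w, u in zip(with_context, uplift)]

impact = sum(with_context) / len(with_context)          # mean scenario score
baseline = sum(without_context) / len(without_context)  # mean without context
multiplier = impact / baseline                          # context uplift factor

print(f"Impact ~ {impact:.1f}%")        # close to the reported 65%
print(f"Uplift ~ {multiplier:.2f}x")    # close to the reported 1.80x
```

Under these assumptions the mean with-context score comes out near 66% against the reported 65% (likely a rounding or weighting difference), and the ratio lands almost exactly on the reported 1.80x.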

Evaluated with agent: Claude Code · Model: Claude Sonnet 4.6