Coaches you through scoping, shipping, and pitching a 24-hour hackathon project at AI Native DevCon (Tessl, London, 1–2 June 2026). Spec-first, track-aware, demo-obsessed. Use when you say "coach me through a DevCon hack", "pressure-test my hackathon idea", "what should I build at AI Native DevCon", "scope my 24h hack", "will I finish this in time", or "draft my demo pitch". Refuses to let you write code before a one-page spec exists.
100
100%
Does it follow best practices?
Impact
100%
1.69x Average score across 5 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent correctly conducts Phase 1 of the hackathon coaching workflow — asking the three required questions one at a time, handling a user who lists multiple itches, matching the itch to a track, and enforcing the exit gate before declaring Phase 1 complete.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Phase announcement",
      "description": "The coach explicitly announces entering Phase 1 (e.g. 'Phase 1 — Interrogate' or equivalent phrasing indicating the phase name) before or at the start of the first question",
      "max_score": 8
    },
    {
      "name": "Questions one at a time",
      "description": "The coach asks the stack question, the track question, and the itch question as separate turns — they are NOT all bundled into a single message",
      "max_score": 12
    },
    {
      "name": "Stack question asked",
      "description": "The coach asks about the user's stack (language, framework, infra) as the first question",
      "max_score": 8
    },
    {
      "name": "Track question asked",
      "description": "The coach asks Alex to pick one DevCon track (or describes the tracks so Alex can pick), as the second question",
      "max_score": 8
    },
    {
      "name": "Itch question asked",
      "description": "The coach asks about one thing Alex secretly wishes existed (or similar phrasing), as the third question",
      "max_score": 8
    },
    {
      "name": "Pushback on multiple itches",
      "description": "Since Alex lists three ideas upfront, the coach explicitly pushes back and asks Alex to pick one itch (e.g. 'Pick the one that makes you lean forward' or similar)",
      "max_score": 15
    },
    {
      "name": "Single itch selected",
      "description": "The session concludes Phase 1 with exactly one named itch/idea selected (not a combination of the original three)",
      "max_score": 10
    },
    {
      "name": "Track matched or clarified",
      "description": "The coach either confirms Alex's chosen track or (if unclear) names a track assignment with an explanation and gives Alex a chance to veto",
      "max_score": 10
    },
    {
      "name": "Exit gate verified",
      "description": "The session log explicitly identifies that Phase 1 is complete and lists all three outputs: stack line, single track, single named itch — before moving to Phase 2",
      "max_score": 12
    },
    {
      "name": "Voice and tone",
      "description": "The coach's messages are short and direct (no long paragraphs of padding), and do not include any apology for pushing back on Alex's multiple ideas",
      "max_score": 9
    }
  ]
}
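
The page does not say how a "weighted_checklist" is turned into a final score. As a rough sketch only, assuming the simplest plausible rule (points awarded per item are clamped to that item's max_score, summed, and reported as a percentage of the total), the aggregation could look like this in Python. The function names `total_score` and `percentage` are hypothetical and are not part of any evaluator described here; only the item names and weights come from the checklist above.

```python
from typing import Dict

# Item names and weights copied from the checklist above.
MAX_SCORES: Dict[str, int] = {
    "Phase announcement": 8,
    "Questions one at a time": 12,
    "Stack question asked": 8,
    "Track question asked": 8,
    "Itch question asked": 8,
    "Pushback on multiple itches": 15,
    "Single itch selected": 10,
    "Track matched or clarified": 10,
    "Exit gate verified": 12,
    "Voice and tone": 9,
}


def total_score(awarded: Dict[str, float]) -> float:
    """Sum awarded points (assumed rule, not confirmed by the source).

    Each item is clamped to the range [0, max_score]; items missing
    from `awarded` count as 0. Unknown item names raise KeyError.
    """
    unknown = set(awarded) - set(MAX_SCORES)
    if unknown:
        raise KeyError(f"unknown checklist items: {sorted(unknown)}")
    return sum(
        min(max(awarded.get(name, 0), 0), cap)
        for name, cap in MAX_SCORES.items()
    )


def percentage(awarded: Dict[str, float]) -> float:
    """Express the clamped total as a percentage of the maximum total."""
    return 100 * total_score(awarded) / sum(MAX_SCORES.values())
```

Under this assumed rule, a run that earns full marks everywhere except "Pushback on multiple itches" would land at 85 out of 100, which reflects that item carrying the largest single weight in the rubric.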