Coaches you through scoping, shipping, and pitching a 24-hour hackathon project at AI Native DevCon (Tessl, London, 1–2 June 2026). Spec-first, track-aware, demo-obsessed. Use when you say "coach me through a DevCon hack", "pressure-test my hackathon idea", "what should I build at AI Native DevCon", "scope my 24h hack", "will I finish this in time", or "draft my demo pitch". Refuses to let you write code before a one-page spec exists.
100
Does it follow best practices? 100%
Impact: 100% (1.69x average score across 5 eval scenarios)
Passed: no known issues
{
"context": "Tests whether the agent correctly runs Phase 4 of the coaching workflow — producing exactly three pitch sentences each under 20 words, generating all five standard judge Q&A pairs, delivering the exact terminal state phrase, and not offering further implementation help after coaching ends.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Phase announcement",
"description": "The coach explicitly announces Phase 4 by name (e.g. 'Phase 4 — Pitch it' or equivalent) before starting",
"max_score": 5
},
{
"name": "Priya's draft rejected",
"description": "The coach rejects or pushes back on Priya's verbose opening pitch draft rather than accepting it as-is",
"max_score": 8
},
{
"name": "Exactly three sentences",
"description": "The final pitch in pitch.md contains exactly three sentences (wedge, move, moment), no more and no fewer",
"max_score": 10
},
{
"name": "Sentence 1 under 20 words",
"description": "The wedge sentence (Sentence 1) in pitch.md is 20 words or fewer",
"max_score": 8
},
{
"name": "Sentence 2 under 20 words",
"description": "The move sentence (Sentence 2) in pitch.md is 20 words or fewer",
"max_score": 8
},
{
"name": "Word counts shown",
"description": "pitch.md shows the word count for each of the three sentences (demonstrating they were counted)",
"max_score": 6
},
{
"name": "Sentence 3 is 'Watch this'",
"description": "The moment sentence (Sentence 3) is 'Watch this.' or a close equivalent pointing to the live demo",
"max_score": 6
},
{
"name": "Five standard Q&A present",
"description": "pitch.md includes all five standard judge questions: scale, existing tool comparison, LLM hallucination, who pays, and moat",
"max_score": 10
},
{
"name": "Q&A answers one sentence each",
"description": "Every judge Q&A answer in pitch.md is a single sentence (not multiple sentences or bullet points)",
"max_score": 8
},
{
"name": "Terminal state phrase exact",
"description": "The session log contains the terminal phrase 'You have a spec, a plan, and a pitch. Stop planning. Go build. You have 24 hours.', quoted verbatim or paraphrased very closely",
"max_score": 12
},
{
"name": "No implementation help offered",
"description": "After the terminal state, the coach does NOT offer help with implementation, coding, or next steps beyond the coaching scope",
"max_score": 10
},
{
"name": "Dry-run checklist in pitch.md",
"description": "pitch.md includes a dry-run checklist with all items checked off (sentences under 20 words, Q&A complete, etc.)",
"max_score": 9
}
]
}
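
Several items in the checklist above are mechanical (sentence count, per-sentence word count) and the checklist is weighted by `max_score`, so both can be sketched in a few lines. The following is a minimal illustration, not the evaluator's actual implementation: the function names, the naive sentence splitter, the sample pitch, and the assumption that partial points are allowed are all hypothetical. The rubric's item names say "under 20 words" while the descriptions say "20 words or fewer"; the sketch follows the descriptions and exposes the limit as a parameter.

```python
import re

# Assumption: the limit is inclusive, per the item descriptions ("20 words or fewer").
WORD_LIMIT = 20


def split_sentences(text: str) -> list[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_count(sentence: str) -> int:
    """Count whitespace-separated tokens; 'one-page' counts as one word."""
    return len(sentence.split())


def check_pitch(pitch: str, limit: int = WORD_LIMIT) -> dict[str, bool]:
    """Evaluate the mechanical pitch checks, keyed by the rubric's item names."""
    sentences = split_sentences(pitch)
    return {
        "Exactly three sentences": len(sentences) == 3,
        "Sentence 1 under 20 words": len(sentences) >= 1
        and word_count(sentences[0]) <= limit,
        "Sentence 2 under 20 words": len(sentences) >= 2
        and word_count(sentences[1]) <= limit,
    }


def weighted_score(checklist: list[dict], awarded: dict[str, float]) -> float:
    """Return the percentage of total max_score earned.

    `awarded` maps item name to points; each value is clamped to
    [0, max_score] and missing items count as zero.
    """
    total = sum(item["max_score"] for item in checklist)
    earned = sum(
        min(max(awarded.get(item["name"], 0), 0), item["max_score"])
        for item in checklist
    )
    return 100 * earned / total


if __name__ == "__main__":
    # Hypothetical pitch, made up for illustration only.
    sample = (
        "Hackathon teams waste hours debating scope. "
        "Our coach forces a one-page spec first. "
        "Watch this."
    )
    print(check_pitch(sample))
```

The judgment-based items (rejecting the draft, not offering implementation help) still need a human or model grader; only the counting checks reduce to code like this. The `max_score` values in the checklist sum to 100, so under this scheme the earned points and the percentage coincide.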