Four-skill presentation system: ingest talks into a rhetoric vault, run interactive clarification, generate a speaker profile, then create new presentations that match your documented patterns. Includes an 88-entry Presentation Patterns taxonomy for scoring, brainstorming, and go-live preparation.
96
93%
Does it follow best practices?
Impact
97%
1.21x
Average score across 30 eval scenarios
Advisory
Suggest reviewing before use
{
  "context": "Tests whether the agent implements the skill's humor post-mortem, blind spot detection, and recency-adapted clarification session — grounding questions in specific talk observations rather than asking generic questions, grading humor effectiveness, capturing spontaneous moments, and compressing questioning for older talks.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Per-joke grading",
      "description": "Each identified humor beat in the debrief report receives an effectiveness grade from a defined set (e.g., 'hit', 'nod', 'flat', or similar landing-quality labels) — not a numeric score or pass/fail",
      "max_score": 10
    },
    {
      "name": "Joke questions grounded in observations",
      "description": "The debrief questions reference SPECIFIC humor beats from the analysis (verbatim quotes, slide numbers, or meme descriptions) — not generic 'did your jokes land?' questions",
      "max_score": 12
    },
    {
      "name": "Meme slide questions",
      "description": "For meme-only or meme-with-text slides identified in the analysis, the questionnaire asks specifically whether the audience visibly reacted or whether the meme was talked over",
      "max_score": 8
    },
    {
      "name": "Spontaneous humor capture",
      "description": "The questionnaire includes at least one question specifically asking about humor that happened OUTSIDE the transcript — improvised callbacks, audience riffs, recovery humor from failures, or unrecorded moments. Must be a distinct question, not folded into a generic 'anything else?' prompt",
      "max_score": 10
    },
    {
      "name": "Promotion prompt",
      "description": "The report or questionnaire includes a field or question about whether well-received spontaneous humor should be incorporated into future versions of the talk — e.g., a 'promote to planned' flag, a 'reuse recommendation' field, or an explicit question like 'should this become a planned beat?'",
      "max_score": 8
    },
    {
      "name": "Blind spot questions",
      "description": "The questionnaire asks about at least 2 things the analysis CANNOT observe: room energy, audience size, technical failures, physical stage moments, or audience reactions not captured in transcript",
      "max_score": 10
    },
    {
      "name": "Demo reaction questions",
      "description": "For demo sections identified in the analysis, the questionnaire asks about audience engagement or reactions during the demo — whether the audience was engaged, reactions to demo outcomes, any failures or recovery moments. Can be a dedicated demo question or included within blind spot / audience reaction questions",
      "max_score": 8
    },
    {
      "name": "Recency adaptation — recent",
      "description": "For the recent talk (within past week), the debrief contains detailed per-joke and per-interaction questions — at least 5 specific questions about individual humor beats",
      "max_score": 10
    },
    {
      "name": "Recency adaptation — old",
      "description": "For the old talk (2+ years ago), the debrief is COMPRESSED — fewer questions, broader scope (e.g., 'any jokes you remember landing well or badly?' rather than per-joke grading). Demonstrably shorter than the recent talk debrief",
      "max_score": 12
    },
    {
      "name": "Structured output format",
      "description": "The debrief report stores humor grades and blind spot observations in a structured format (JSON with named fields) — not just free-text prose",
      "max_score": 12
    }
  ]
}
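The `weighted_checklist` type suggests that each item contributes up to its `max_score` points toward a total (the ten items above sum to 100). The following is a minimal sketch of how such a checklist could be tallied, not the evaluator's actual implementation. The `score_checklist` function, the `awarded` mapping, and the clamping behavior are assumptions made for illustration; only the item names and `max_score` values come from the criteria above.

```python
import json

# Excerpt of the checklist above (three of the ten items), trimmed to the
# fields this sketch needs. Names and max_score values are copied verbatim.
CHECKLIST_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Per-joke grading", "max_score": 10},
    {"name": "Joke questions grounded in observations", "max_score": 12},
    {"name": "Structured output format", "max_score": 12}
  ]
}
"""


def score_checklist(checklist, awarded):
    """Hypothetical tally: sum awarded points, clamped to [0, max_score] per item.

    Returns (earned, possible, percent). Items missing from `awarded` score 0.
    """
    earned = 0
    possible = 0
    for item in checklist:
        cap = item["max_score"]
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, cap))  # never exceed the item's weight
        possible += cap
    percent = 100.0 * earned / possible if possible else 0.0
    return earned, possible, percent


spec = json.loads(CHECKLIST_JSON)

# Hypothetical grader output; the 15 is deliberately over the cap of 12.
awarded = {
    "Per-joke grading": 10,
    "Joke questions grounded in observations": 6,
    "Structured output format": 15,
}

earned, possible, percent = score_checklist(spec["checklist"], awarded)
print(f"{earned}/{possible} ({percent:.1f}%)")
```

Clamping each item to its own `max_score` keeps one over-generous grade from masking a missed criterion elsewhere; whether the real evaluator does this is not stated in the source.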
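The "Structured output format", "Per-joke grading", and "Promotion prompt" items together describe a debrief report that stores labelled humor grades and blind spot observations as JSON with named fields. One possible shape is sketched below. Every class and field name here (`HumorBeat`, `DebriefReport`, `promote_to_planned`, `blind_spots`, and so on) is hypothetical, and the grade set simply reuses the example labels from the checklist; the skill's real schema is not shown in the source.

```python
import json
from dataclasses import dataclass, field, asdict

# Assumed label set, taken from the checklist's own examples ('hit', 'nod', 'flat').
GRADES = ("hit", "nod", "flat")


@dataclass
class HumorBeat:
    """One humor beat, tied to a specific observation from the talk analysis."""
    slide: int
    quote: str
    grade: str
    spontaneous: bool = False         # happened outside the transcript
    promote_to_planned: bool = False  # reuse in future versions of the talk

    def __post_init__(self):
        # Enforce a label from the defined set rather than a number or pass/fail.
        if self.grade not in GRADES:
            raise ValueError(f"grade must be one of {GRADES}, got {self.grade!r}")


@dataclass
class DebriefReport:
    """Hypothetical report container serialized as JSON with named fields."""
    talk_id: str
    humor_beats: list = field(default_factory=list)
    blind_spots: dict = field(default_factory=dict)

    def to_json(self):
        # asdict() recurses into the nested HumorBeat dataclasses.
        return json.dumps(asdict(self), indent=2)


# Illustrative data only; none of these values come from a real talk.
report = DebriefReport(talk_id="example-talk")
report.humor_beats.append(
    HumorBeat(slide=12, quote="placeholder joke text", grade="hit",
              spontaneous=True, promote_to_planned=True)
)
report.blind_spots["room_energy"] = "placeholder observation"
report.blind_spots["technical_failures"] = "placeholder observation"

print(report.to_json())
```

Validating the grade at construction time means a report cannot be serialized with a numeric score or pass/fail value in the grade field, which is the distinction the "Per-joke grading" item draws.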