Four-skill presentation system: ingest talks into a rhetoric vault, run interactive clarification, generate a speaker profile, then create new presentations that match your documented patterns. Includes an 88-entry Presentation Patterns taxonomy for scoring, brainstorming, and go-live preparation.
96
Does it follow best practices? 93%
Impact: 97% average score across 30 eval scenarios (1.21x)
Advisory: suggest reviewing before use
{
  "context": "Tests whether the agent builds a talk tracking database with correct structure, status logic, file skip patterns, and slide source determination rules.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Top-level DB structure",
      "description": "tracking-database.json has clearly separated sections for: configuration, talk entries (array), presentation file catalog (array), and a place for confirmed user intents or preferences",
      "max_score": 10
    },
    {
      "name": "Config captures source directories",
      "description": "The config section records the paths to the talks source directory and the presentations source directory, so the scan is reproducible",
      "max_score": 5
    },
    {
      "name": "Talk fields extracted",
      "description": "Each talk entry includes at least: filename, title, conference, date, and status",
      "max_score": 8
    },
    {
      "name": "URL fields parsed",
      "description": "Talk entries correctly parse video_url and slides_url from the markdown front matter, and extract youtube_id and/or google_drive_id from the URLs",
      "max_score": 10
    },
    {
      "name": "New talks default to pending",
      "description": "New talks discovered from markdown files get a status indicating they are waiting to be processed, distinct from processed, failed, or skipped statuses",
      "max_score": 8
    },
    {
      "name": "No-sources talk flagged as unprocessable",
      "description": "The 'AI-Assisted Testing' talk (which has no video_url and no slides) is flagged with a status indicating it cannot be processed due to missing sources, not left in the default pending state",
      "max_score": 10
    },
    {
      "name": "Slide source determination",
      "description": "Talks are assigned a slide source value based on what's available: talks with both PDF slides and PPTX get a value reflecting both; talks with only one get the appropriate single-source value",
      "max_score": 10
    },
    {
      "name": "Static files skipped",
      "description": "The presentation catalog or processing log shows that 'DevOps Reframed static.pptx' was skipped/excluded",
      "max_score": 8
    },
    {
      "name": "Conflict copies skipped",
      "description": "The presentation catalog or processing log shows that 'DevOps Reframed (1).pptx' was skipped/excluded",
      "max_score": 8
    },
    {
      "name": "Template files skipped",
      "description": "The presentation catalog or processing log shows that the template file was skipped/excluded",
      "max_score": 8
    },
    {
      "name": "PPTX-to-talk matching",
      "description": "The presentation catalog entries include a reference to their matched talk (by filename or title), with at least 2 correctly matched",
      "max_score": 5
    },
    {
      "name": "Scan report with counts",
      "description": "The scan report includes counts of talks found, processable talks, and presentation file cataloging stats",
      "max_score": 5
    }
  ]
}
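
The rules this checklist scores (skip patterns, URL ID extraction, default status, and slide source determination) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the regex patterns, and the status and slide-source values (`pending`, `skipped_no_sources`, `both`, `pdf`, `pptx`) are all assumptions, since the checklist describes the behavior but does not name the exact values.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed skip rules: the checklist names the categories (static exports,
# conflict copies, templates) but not the exact patterns used to detect them.
SKIP_PATTERNS = [
    ("static", re.compile(r"\bstatic\b", re.IGNORECASE)),
    ("conflict_copy", re.compile(r"\(\d+\)\s*$")),
    ("template", re.compile(r"\btemplate\b", re.IGNORECASE)),
]


def skip_reason(filename):
    """Return the name of the skip rule a presentation file matches, or None."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension before matching
    for reason, pattern in SKIP_PATTERNS:
        if pattern.search(stem):
            return reason
    return None


def youtube_id(url):
    """Extract a video ID from a youtube.com/watch or youtu.be URL, else None."""
    parsed = urlparse(url or "")
    host = parsed.netloc.lower()
    if host in ("youtu.be", "www.youtu.be"):
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def google_drive_id(url):
    """Extract a file ID from a Google Drive style URL, else None."""
    parsed = urlparse(url or "")
    if not parsed.netloc.lower().endswith("google.com"):
        return None
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)
    return parse_qs(parsed.query).get("id", [None])[0]


def slide_source(has_pdf, has_pptx):
    """Pick a slide source value from what is available (values are assumed)."""
    if has_pdf and has_pptx:
        return "both"
    if has_pdf:
        return "pdf"
    if has_pptx:
        return "pptx"
    return None


def initial_status(video_url, slides_url, has_pptx=False):
    """New talks start as pending unless they have no usable source at all."""
    if not video_url and not slides_url and not has_pptx:
        return "skipped_no_sources"  # hypothetical status name
    return "pending"  # hypothetical status name
```

Under these assumptions, `skip_reason("DevOps Reframed static.pptx")` and `skip_reason("DevOps Reframed (1).pptx")` both return a non-`None` reason, so those files would be logged as excluded, and a talk with neither a video URL nor slides gets a status other than `pending`.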