Four-skill presentation system: ingest talks into a rhetoric vault, run interactive clarification, generate a speaker profile, then create new presentations that match your documented patterns. Includes an 88-entry Presentation Patterns taxonomy for scoring, brainstorming, and go-live preparation.
96
93%
Does it follow best practices?
Impact
97%
1.21x
Average score across 30 eval scenarios
Advisory
Suggest reviewing before use
{
  "context": "Tests whether the agent correctly processes a multilingual tech conference talk into a rhetoric knowledge vault, covering the 14-dimension analysis format, English-only language policy for non-English quotes, observable-only pattern tagging with the correct scoring formula, and the required structured output files.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Analysis file location",
      "description": "The per-talk rhetoric analysis file is written inside an `analyses/` subdirectory of the vault root (e.g., vault/analyses/2024-devoxx-jvm-myths.md), not in the vault root itself",
      "max_score": 5
    },
    {
      "name": "All 14 dimensions present",
      "description": "The analysis file contains coverage of all 14 dimensions: Opening Pattern, Narrative Structure, Humor & Wit, Audience Interaction, Transition Techniques, Closing Pattern, Verbal Signatures, Slide-to-Speech Relationship, Persuasion Techniques, Cultural References, Technical Content Delivery, Pacing Clues, Slide Design Patterns, and Areas for Improvement",
      "max_score": 15
    },
    {
      "name": "English-only quote format",
      "description": "Russian phrases from the transcript (e.g., 'Ну, JVM же медленная' or 'Получается что...') appear in the analysis with BOTH an English translation AND the Russian original. The English translation must come before the Russian text, not after it, and the Russian must not appear without a translation. Check that at least one such pair exists.",
      "max_score": 10
    },
    {
      "name": "Language-tagged verbal signatures",
      "description": "Russian verbal signatures are stored separately from the English signature list and tagged with language code (e.g., [ru] \"ну и\"), NOT merged into the main English verbal signatures section",
      "max_score": 8
    },
    {
      "name": "Structured data fields",
      "description": "The analysis or tracking DB entry includes all required structured_data fields: slide_count, talk_duration_estimate, meme_count, image_only_slide_count, audience_interaction_count, opening_type, closing_type, narrative_arc_type",
      "max_score": 8
    },
    {
      "name": "Pattern score formula",
      "description": "The per-talk pattern_score is calculated as count(patterns) minus count(antipatterns), not a sum or average of the individual pattern values",
      "max_score": 10
    },
    {
      "name": "Observable patterns only",
      "description": "None of the following unobservable patterns appear in the pattern_observations scored section: preparation, carnegie-hall, stakeout, posse, seeding-satisfaction, shoeless, lightsaber, red-yellow-green, laser-weapons, bunker, backchannel",
      "max_score": 8
    },
    {
      "name": "Verbatim examples",
      "description": "The analysis includes a verbatim_examples section with at least three populated categories from: signature_phrases, jokes, transitions, audience_addresses, opening_lines, closing_lines",
      "max_score": 9
    },
    {
      "name": "Tracking DB updated",
      "description": "tracking-database.json exists and contains the processed talk with status set to either 'processed' or 'processed_partial' (not 'pending'), and includes a processed_date value",
      "max_score": 10
    },
    {
      "name": "Areas for improvement present",
      "description": "The analysis file contains a section for dimension 14 (Areas for Improvement / Reflection) that identifies at least two specific weaknesses or risks in the talk's delivery, not just a placeholder, empty section, or single generic statement",
      "max_score": 9
    },
    {
      "name": "Presentation Patterns Scoring section",
      "description": "The analysis file contains a 'Presentation Patterns Scoring' section listing at least one detected pattern or antipattern with confidence level (strong/moderate/weak) and evidence",
      "max_score": 8
    }
  ]
}
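The "Pattern score formula" and "Observable patterns only" criteria in the checklist can be read together as a small computation. The following is a minimal sketch, not the package's actual implementation: the observation shape (dicts with `id` and `kind` keys) and the example pattern IDs are hypothetical. Only the formula and the list of unobservable IDs come from the checklist.

```python
# Unobservable pattern IDs, copied from the "Observable patterns only" criterion.
UNOBSERVABLE = frozenset({
    "preparation", "carnegie-hall", "stakeout", "posse", "seeding-satisfaction",
    "shoeless", "lightsaber", "red-yellow-green", "laser-weapons", "bunker",
    "backchannel",
})


def pattern_score(observations):
    """Return count(patterns) - count(antipatterns) over observable entries.

    Each observation is assumed (hypothetically) to be a dict with an "id"
    and a "kind" that is either "pattern" or "antipattern".
    """
    observable = [o for o in observations if o["id"] not in UNOBSERVABLE]
    patterns = sum(1 for o in observable if o["kind"] == "pattern")
    antipatterns = sum(1 for o in observable if o["kind"] == "antipattern")
    return patterns - antipatterns


# Hypothetical observations; the "example-*" IDs are placeholders, not taxonomy entries.
observations = [
    {"id": "example-pattern-a", "kind": "pattern"},
    {"id": "example-pattern-b", "kind": "pattern"},
    {"id": "example-antipattern-a", "kind": "antipattern"},
    {"id": "bunker", "kind": "pattern"},  # unobservable, so excluded from scoring
]

print(pattern_score(observations))
```

The score is a difference of counts, so two detected patterns and one antipattern give 1 regardless of any per-pattern confidence values.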
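The `max_score` values in the weighted checklist sum to 100. As a rough illustration of how a `weighted_checklist` result might be aggregated, the sketch below clamps each awarded score to its criterion's maximum and normalizes by the total. The aggregation rule itself and the `scenario_percent` helper are assumptions for illustration; the source does not state how the grader combines criteria.

```python
# Criterion names and max_score values, copied from the checklist.
MAX_SCORES = {
    "Analysis file location": 5,
    "All 14 dimensions present": 15,
    "English-only quote format": 10,
    "Language-tagged verbal signatures": 8,
    "Structured data fields": 8,
    "Pattern score formula": 10,
    "Observable patterns only": 8,
    "Verbatim examples": 9,
    "Tracking DB updated": 10,
    "Areas for improvement present": 9,
    "Presentation Patterns Scoring section": 8,
}


def scenario_percent(awarded):
    """Return the scenario score as a percentage of the total max_score.

    `awarded` maps criterion name to points given. Missing criteria count
    as 0, and each value is clamped to the range [0, max_score].
    This aggregation rule is an assumption, not documented behavior.
    """
    total_max = sum(MAX_SCORES.values())
    total = 0
    for name, max_score in MAX_SCORES.items():
        points = awarded.get(name, 0)
        total += min(max(points, 0), max_score)
    return 100 * total / total_max


# Hypothetical run: full marks everywhere except the 14-dimension criterion.
awarded = dict(MAX_SCORES)
awarded["All 14 dimensions present"] = 0
print(scenario_percent(awarded))
```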