An automated pipeline that takes a company name and produces a custom Tessl skill plus an eval report showing per-scenario lift (baseline agent vs. with-skill agent). This is the A1 MVP cell of the produce/consume × personalization 2×2.
88
Does it follow best practices? 86%
Impact: 89% (1.45x), average score across 13 eval scenarios
Advisory: suggest reviewing before use

| Criterion | Baseline | With skill |
| --- | --- | --- |
| parse-attendees.py invoked | 0% | 93% |
| Empty-result reported as zero usable rows | 0% | 0% |
| No batch-manifest.json written | 100% | 100% |
| No per-company subdirectories | 100% | 100% |
| No index.md fabricated | 100% | 100% |
| Re-export guidance present | 0% | 58% |
| Source CSV named | 100% | 100% |
| Header row noted | 0% | 0% |

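The per-scenario lift mentioned in the description can be illustrated with the eight criteria listed above. This is a minimal sketch only: it assumes lift is the ratio of the with-skill average to the baseline average, which is one plausible reading of the "1.45x" headline figure. The formula actually used by `compute-lift.py` is not shown on this page, and the function name `scenario_lift` is hypothetical.

```python
from statistics import mean

# (criterion, baseline %, with-skill %) copied from the scenario above.
rows = [
    ("parse-attendees.py invoked", 0, 93),
    ("Empty-result reported as zero usable rows", 0, 0),
    ("No batch-manifest.json written", 100, 100),
    ("No per-company subdirectories", 100, 100),
    ("No index.md fabricated", 100, 100),
    ("Re-export guidance present", 0, 58),
    ("Source CSV named", 100, 100),
    ("Header row noted", 0, 0),
]


def scenario_lift(rows):
    """Return (baseline_avg, with_skill_avg, ratio).

    Assumption: lift = with-skill average / baseline average.
    The ratio is None when the baseline average is zero.
    """
    baseline = mean(r[1] for r in rows)
    with_skill = mean(r[2] for r in rows)
    ratio = with_skill / baseline if baseline else None
    return baseline, with_skill, ratio


baseline, with_skill, ratio = scenario_lift(rows)
print(f"baseline {baseline:.1f}%, with skill {with_skill:.1f}%, lift {ratio:.2f}x")
```

Returning `None` for a zero baseline avoids reporting an infinite or invented lift, in the spirit of the "No fabricated lift numbers" criterion later in the list.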
Selection validation and slug resolution

| Criterion | Baseline | With skill |
| --- | --- | --- |
| acme-corp: latest run selected | 100% | 100% |
| acme-corp: mode defaults to consume | 0% | 100% |
| beta-inc: stops with defer rationale | 100% | 100% |
| beta-inc: no scaffold command | 100% | 100% |
| acme-corp: correct --path in skill new | 0% | 100% |
| acme-corp: --name from target title | 0% | 100% |
| acme-corp: --description flag present | 0% | 100% |
| beta-inc: rationale quoted | 100% | 100% |
| acme-corp: run_dir identified | 100% | 100% |
| acme-corp: target title in log | 100% | 100% |

Pipeline command accuracy

| Criterion | Baseline | With skill |
| --- | --- | --- |
| tessl skill new flags | 100% | 100% |
| Remove before scaffold | 100% | 100% |
| Scenario generate flags | 50% | 100% |
| Scenario download last flag | 0% | 100% |
| Negative case scenario | 0% | 0% |
| Skill review threshold | 0% | 100% |
| Review retry cap | 0% | 100% |
| Eval variant without-context | 100% | 100% |
| Eval variant with-context | 100% | 100% |
| Eval agent flag | 0% | 100% |
| compute-lift.py invocation | 0% | 100% |
| render-report.py invocation | 0% | 100% |


| Criterion | Baseline | With skill |
| --- | --- | --- |
| No generated-skill scaffold | 100% | 0% |
| No tessl scenario generate | 100% | 0% |
| No tessl skill review invocation | 100% | 0% |
| No eval execution | 100% | 100% |
| selection_status surfaced in run-log.md | 0% | 0% |
| selection_rationale surfaced | 0% | 0% |
| No fabricated lift numbers | 0% | 100% |
| Run-log.md exists and explains the halt | 0% | 12% |

UNKNOWN protocol, tagline and DBA rules

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Tagline resolved to company | 100% | 100% |
| DBA pair bucketed once | 100% | 100% |
| DBA entity correctly routed | 100% | 100% |
| Tagline resolution documented | 100% | 100% |
| Minimal UNKNOWN count | 100% | 100% |
| Newsletter/non-engineering in SELF_OR_NA | 100% | 100% |
| Four sections in order | 25% | 100% |
| RUN_DISCOVERY alphabetized | 0% | 100% |
| Dedup output file | 100% | 100% |
| Summary line format | 0% | 100% |

Sector-neutral triage and report format

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Dedup output file | 100% | 100% |
| Count delta reported | 100% | 100% |
| Four sections in order | 100% | 100% |
| No rationale in MEGA_CORP | 0% | 100% |
| No rationale in SELF_OR_NA | 0% | 100% |
| RUN_DISCOVERY alphabetized | 100% | 100% |
| Banks and consultancies routed to RUN_DISCOVERY | 100% | 100% |
| Obvious MEGA_CORP identified | 100% | 100% |
| Obvious SELF_OR_NA identified | 100% | 100% |
| MEGA_CORP sub-brand note | 62% | 100% |
| Summary line format | 0% | 100% |
| triage-skipped.md exists | 100% | 100% |


| Criterion | Baseline | With skill |
| --- | --- | --- |
| Dedup yield reported as zero | 13% | 66% |
| Input filename named | 100% | 100% |
| Re-export guidance present | 100% | 100% |
| No triage-report.md written | 100% | 100% |
| No fabricated four-section report | 100% | 100% |
| dedup-output.json reflects empty result | 20% | 40% |

Hidden multi-brand active check

| Criterion | Baseline | With skill |
| --- | --- | --- |
| ZEISS as MEGA_CORP | 100% | 100% |
| IKEA as MEGA_CORP | 100% | 100% |
| Schibsted as MEGA_CORP | 100% | 100% |
| McKinsey as MEGA_CORP | 0% | 100% |
| Monterro as MEGA_CORP | 0% | 100% |
| Capital One in RUN_DISCOVERY | 100% | 100% |
| Rabobank in RUN_DISCOVERY | 100% | 100% |
| Thoughtworks in RUN_DISCOVERY | 100% | 100% |
| Robert Bosch as MEGA_CORP | 100% | 100% |
| MEGA_CORP sub-brand note | 50% | 100% |
| Summary line format | 0% | 100% |
| NEA in SELF_OR_NA | 0% | 100% |

AMBIGUOUS verdict for hidden multi-brand company
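
| Criterion | Baseline | With skill |
| --- | --- | --- |
| verdict AMBIGUOUS | 15% | 100% |
| sub_brands_detected present | 20% | 100% |
| sub_brands minimum count | 66% | 100% |
| no skill_targets | 100% | 100% |
| sources present | 0% | 100% |
| re-invocation guidance | 70% | 100% |
| intake-notes.md exists | 100% | 100% |
| company field present | 50% | 100% |
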

| Criterion | Baseline | With skill |
| --- | --- | --- |
| verdict SKIP | 0% | 100% |
| skip_reason names consumer-side probes | 0% | 100% |
| search_attempted populated | 0% | 100% |
| would_change_verdict_if populated | 0% | 100% |
| no INTEGRATE/external inflation | 100% | 100% |
| sources present | 0% | 100% |
| intake-notes.md exists | 100% | 100% |
| company field present | 50% | 100% |

AMBIGUOUS verdict for multi-brand entity

| Criterion | Baseline | With skill |
| --- | --- | --- |
| verdict AMBIGUOUS | 100% | 100% |
| sub_brands_detected present | 100% | 100% |
| Multiple sub-brands listed | 100% | 100% |
| Sub-brand entries have name field | 100% | 100% |
| guidance field present | 100% | 100% |
| guidance mentions parent/sub-brand | 50% | 50% |
| No non-empty skill_targets | 100% | 100% |
| schema_version 3 present | 100% | 100% |
| mode: produce present | 100% | 100% |
| No auto-resolved sub-brand chosen | 100% | 100% |

Slug-based lookup, produce-mode ranking, skip selection

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Most-recent run selected | 100% | 100% |
| Low-confidence target absent | 0% | 100% |
| Correct rank order by raw confidence | 100% | 100% |
| No booth-aha formula applied | 100% | 100% |
| No consume-mode columns | 100% | 100% |
| Common columns present | 62% | 100% |
| selection.json in run directory | 100% | 100% |
| selection_status is skipped | 100% | 100% |
| selected_target_id is null | 0% | 100% |
| selection_rationale is non-empty | 100% | 100% |
| schema_version and selected_at present | 100% | 100% |

Consume-mode booth-aha scoring and selection persistence

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Low-confidence target dropped | 0% | 100% |
| Correct rank order | 80% | 100% |
| Booth-aha score column present | 30% | 100% |
| Consume-mode columns included | 40% | 100% |
| Common columns present | 100% | 100% |
| selection.json in inputs/ | 0% | 100% |
| schema_version is 1 | 100% | 100% |
| discovery_path is absolute | 0% | 100% |
| selection_status is selected | 100% | 100% |
| selected_at is ISO-8601 UTC | 100% | 100% |
| Validator script run | 0% | 0% |

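Several of the criteria above describe the shape of `selection.json` (`schema_version` is 1, `discovery_path` is absolute, `selected_at` is ISO-8601 UTC, `selected_target_id` is null when the status is `skipped`). The sketch below shows how such checks might be expressed. It is not the pipeline's validator script, which is not shown on this page: the function `check_selection`, the POSIX-path assumption, and the sample values are all hypothetical, and only the field names come from the criteria.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath


def check_selection(doc: dict) -> list[str]:
    """Return a list of problems found in a selection.json-like dict (hypothetical checks)."""
    problems = []

    if doc.get("schema_version") != 1:
        problems.append("schema_version must be 1")

    # Assumption: paths are POSIX-style, so "absolute" means a leading slash.
    if not PurePosixPath(str(doc.get("discovery_path", ""))).is_absolute():
        problems.append("discovery_path must be absolute")

    status = doc.get("selection_status")
    if status not in ("selected", "skipped"):
        problems.append("selection_status must be 'selected' or 'skipped'")
    if status == "skipped" and doc.get("selected_target_id") is not None:
        problems.append("selected_target_id must be null when skipped")

    if not str(doc.get("selection_rationale", "")).strip():
        problems.append("selection_rationale must be non-empty")

    # Accept a trailing "Z" or an explicit +00:00 offset as UTC.
    try:
        raw = str(doc.get("selected_at", "")).replace("Z", "+00:00")
        if datetime.fromisoformat(raw).utcoffset() != timedelta(0):
            problems.append("selected_at must be in UTC")
    except ValueError:
        problems.append("selected_at must be ISO-8601")

    return problems


# Illustrative document; every value here is made up.
example = {
    "schema_version": 1,
    "discovery_path": "/runs/acme-corp/discovery.json",
    "selection_status": "skipped",
    "selected_target_id": None,
    "selection_rationale": "No target met the confidence bar.",
    "selected_at": "2024-01-01T12:00:00Z",
}
print(check_selection(example))
```

An empty list means the document passed every check; each failed check adds one human-readable message.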