jbaruch/auto-skill-discovery

Automated pipeline that takes a company name and produces a custom Tessl skill plus an eval report showing per-scenario lift (baseline agent vs with-skill agent). A1 MVP cell of the produce/consume × personalization 2x2.

Score: 88, 1.45x

Quality: 86% (Does it follow best practices?)

Impact: 89%, 1.45x (average score across 13 eval scenarios)
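
To make the relationship between the two agent runs concrete, here is a minimal sketch of how a per-scenario lift and an overall multiplier could be derived from a baseline score and a with-skill score. The function names are hypothetical, and both formulas (a percentage-point difference per scenario, a ratio for the multiplier) are assumptions inferred from the numbers shown on this page; the pipeline's own compute-lift.py is not reproduced here and may work differently.

```python
from typing import Optional


def lift_points(baseline_pct: float, with_skill_pct: float) -> float:
    """Percentage-point change from the baseline run to the with-skill run (assumed definition)."""
    return with_skill_pct - baseline_pct


def lift_multiplier(baseline_pct: float, with_skill_pct: float) -> Optional[float]:
    """Ratio of with-skill score to baseline score; None when the baseline is zero."""
    if baseline_pct == 0:
        return None  # a ratio is undefined against a zero baseline
    return with_skill_pct / baseline_pct


if __name__ == "__main__":
    # Illustrative inputs only, not taken from the pipeline's output files.
    print(lift_points(59.0, 23.0))
    print(lift_multiplier(50.0, 75.0))
```

A negative result from lift_points means the with-skill agent scored below the baseline on that scenario.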

Security by Snyk: Advisory. Suggest reviewing before use.


Evaluation results

74%

21%

Run the A1 Batch Driver on the Conference Attendee Export

| Criteria | Without context | With context |
| --- | --- | --- |
| parse-attendees.py invoked | 0% | 93% |
| Empty-result reported as zero usable rows | 0% | 0% |
| No batch-manifest.json written | 100% | 100% |
| No per-company subdirectories | 100% | 100% |
| No index.md fabricated | 100% | 100% |
| Re-export guidance present | 0% | 58% |
| Source CSV named | 100% | 100% |
| Header row noted | 0% | 0% |

100%

45%

Pipeline Queue: Selection Validation and Context Loading

Selection validation and slug resolution

| Criteria | Without context | With context |
| --- | --- | --- |
| acme-corp: latest run selected | 100% | 100% |
| acme-corp: mode defaults to consume | 0% | 100% |
| beta-inc: stops with defer rationale | 100% | 100% |
| beta-inc: no scaffold command | 100% | 100% |
| acme-corp: correct --path in skill new | 0% | 100% |
| acme-corp: --name from target title | 0% | 100% |
| acme-corp: --description flag present | 0% | 100% |
| beta-inc: rationale quoted | 100% | 100% |
| acme-corp: run_dir identified | 100% | 100% |
| acme-corp: target title in log | 100% | 100% |

92%

54%

Skill Pipeline Onboarding Script

Pipeline command accuracy

| Criteria | Without context | With context |
| --- | --- | --- |
| tessl skill new flags | 100% | 100% |
| Remove before scaffold | 100% | 100% |
| Scenario generate flags | 50% | 100% |
| Scenario download last flag | 0% | 100% |
| Negative case scenario | 0% | 0% |
| Skill review threshold | 0% | 100% |
| Review retry cap | 0% | 100% |
| Eval variant without-context | 100% | 100% |
| Eval variant with-context | 100% | 100% |
| Eval agent flag | 0% | 100% |
| compute-lift.py invocation | 0% | 100% |
| render-report.py invocation | 0% | 100% |
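
The "Skill review threshold" and "Review retry cap" criteria suggest a loop that re-runs the review until a score clears a threshold or a maximum number of attempts is reached. The sketch below shows that control flow only. The threshold and cap values are not stated on this page, so they are parameters here, and run_review is a hypothetical callable standing in for however the pipeline actually invokes tessl skill review and reads its score.

```python
from typing import Callable, List, Tuple


def review_until_pass(
    run_review: Callable[[], float],  # hypothetical: returns one review score per call
    threshold: float,                 # assumed pass mark; the real value is not given here
    max_attempts: int,                # assumed retry cap; the real value is not given here
) -> Tuple[bool, List[float]]:
    """Call run_review until a score reaches threshold or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    scores: List[float] = []
    for _ in range(max_attempts):
        score = run_review()
        scores.append(score)
        if score >= threshold:
            return True, scores   # passed: stop early
    return False, scores          # cap reached without passing


if __name__ == "__main__":
    fake_scores = iter([70.0, 80.0, 90.0])  # stand-in data for illustration
    print(review_until_pass(lambda: next(fake_scores), threshold=85.0, max_attempts=3))
```

Returning the full score history alongside the pass flag keeps the run log honest about how many attempts were needed.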

23%

-36%

Run the Build-and-Evaluate Pipeline for Vertibrate Industries

| Criteria | Without context | With context |
| --- | --- | --- |
| No generated-skill scaffold | 100% | 0% |
| No tessl scenario generate | 100% | 0% |
| No tessl skill review invocation | 100% | 0% |
| No eval execution | 100% | 100% |
| selection_status surfaced in run-log.md | 0% | 0% |
| selection_rationale surfaced | 0% | 0% |
| No fabricated lift numbers | 0% | 100% |
| Run-log.md exists and explains the halt | 0% | 12% |

100%

22%

Attendee List Triage with Ambiguous Entries

UNKNOWN protocol, tagline and DBA rules

| Criteria | Without context | With context |
| --- | --- | --- |
| Tagline resolved to company | 100% | 100% |
| DBA pair bucketed once | 100% | 100% |
| DBA entity correctly routed | 100% | 100% |
| Tagline resolution documented | 100% | 100% |
| Minimal UNKNOWN count | 100% | 100% |
| Newsletter/non-engineering in SELF_OR_NA | 100% | 100% |
| Four sections in order | 25% | 100% |
| RUN_DISCOVERY alphabetized | 0% | 100% |
| Dedup output file | 100% | 100% |
| Summary line format | 0% | 100% |

100%

29%

Conference Lead Triage

Sector-neutral triage and report format

| Criteria | Without context | With context |
| --- | --- | --- |
| Dedup output file | 100% | 100% |
| Count delta reported | 100% | 100% |
| Four sections in order | 100% | 100% |
| No rationale in MEGA_CORP | 0% | 100% |
| No rationale in SELF_OR_NA | 0% | 100% |
| RUN_DISCOVERY alphabetized | 100% | 100% |
| Banks and consultancies routed to RUN_DISCOVERY | 100% | 100% |
| Obvious MEGA_CORP identified | 100% | 100% |
| Obvious SELF_OR_NA identified | 100% | 100% |
| MEGA_CORP sub-brand note | 62% | 100% |
| Summary line format | 0% | 100% |
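
Two of the report-format criteria above ("Four sections in order" and "RUN_DISCOVERY alphabetized") are mechanical enough to check in code. The following is a sketch under stated assumptions: the report is taken to be already parsed into a list of section names and a list of entries, the required section order is passed in by the caller because this page does not state it, and both function names are hypothetical.

```python
from typing import Sequence


def sections_in_order(found: Sequence[str], expected: Sequence[str]) -> bool:
    """True when the report contains exactly the expected sections, in the expected order."""
    return list(found) == list(expected)


def is_alphabetized(entries: Sequence[str]) -> bool:
    """True when entries are sorted alphabetically, ignoring case."""
    keys = [entry.casefold() for entry in entries]
    return keys == sorted(keys)


if __name__ == "__main__":
    # The order below is a placeholder; the skill defines the real one.
    assumed_order = ["RUN_DISCOVERY", "MEGA_CORP", "SELF_OR_NA", "UNKNOWN"]
    print(sections_in_order(assumed_order, assumed_order))
    print(is_alphabetized(["Capital One", "Rabobank", "Thoughtworks"]))
```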

89%

10%

Pre-Discovery Triage on an Exported Attendee List

| Criteria | Without context | With context |
| --- | --- | --- |
| triage-skipped.md exists | 100% | 100% |
| Dedup yield reported as zero | 13% | 66% |
| Input filename named | 100% | 100% |
| Re-export guidance present | 100% | 100% |
| No triage-report.md written | 100% | 100% |
| No fabricated four-section report | 100% | 100% |
| dedup-output.json reflects empty result | 20% | 40% |

100%

35%

Pre-Discovery Prioritization for an Industry Summit

Hidden multi-brand active check

| Criteria | Without context | With context |
| --- | --- | --- |
| ZEISS as MEGA_CORP | 100% | 100% |
| IKEA as MEGA_CORP | 100% | 100% |
| Schibsted as MEGA_CORP | 100% | 100% |
| McKinsey as MEGA_CORP | 0% | 100% |
| Monterro as MEGA_CORP | 0% | 100% |
| Capital One in RUN_DISCOVERY | 100% | 100% |
| Rabobank in RUN_DISCOVERY | 100% | 100% |
| Thoughtworks in RUN_DISCOVERY | 100% | 100% |
| Robert Bosch as MEGA_CORP | 100% | 100% |
| MEGA_CORP sub-brand note | 50% | 100% |
| Summary line format | 0% | 100% |
| NEA in SELF_OR_NA | 0% | 100% |

100%

52%

Initial Discovery Intake: Siemens

AMBIGUOUS verdict for hidden multi-brand company

| Criteria | Without context | With context |
| --- | --- | --- |
| verdict AMBIGUOUS | 15% | 100% |
| sub_brands_detected present | 20% | 100% |
| sub_brands minimum count | 66% | 100% |
| no skill_targets | 100% | 100% |
| sources present | 0% | 100% |
| re-invocation guidance | 70% | 100% |
| intake-notes.md exists | 100% | 100% |
| company field present | 50% | 100% |

100%

72%

Initial Discovery Intake: Baillie Gifford

| Criteria | Without context | With context |
| --- | --- | --- |
| verdict SKIP | 0% | 100% |
| skip_reason names consumer-side probes | 0% | 100% |
| search_attempted populated | 0% | 100% |
| would_change_verdict_if populated | 0% | 100% |
| no INTEGRATE/external inflation | 100% | 100% |
| sources present | 0% | 100% |
| intake-notes.md exists | 100% | 100% |
| company field present | 50% | 100% |

95%

Discovery Pre-flight for a Conference Lead

AMBIGUOUS verdict for multi-brand entity

| Criteria | Without context | With context |
| --- | --- | --- |
| verdict AMBIGUOUS | 100% | 100% |
| sub_brands_detected present | 100% | 100% |
| Multiple sub-brands listed | 100% | 100% |
| Sub-brand entries have name field | 100% | 100% |
| guidance field present | 100% | 100% |
| guidance mentions parent/sub-brand | 50% | 50% |
| No non-empty skill_targets | 100% | 100% |
| schema_version 3 present | 100% | 100% |
| mode: produce present | 100% | 100% |
| No auto-resolved sub-brand chosen | 100% | 100% |
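
The intake criteria in the Siemens, Baillie Gifford, and multi-brand scenarios read like consistency rules between the verdict and the fields that accompany it. The sketch below encodes a few of them as a checker over a plain dictionary. Field names are taken from the criterion labels; the exact schema, the minimum of two sub-brands (read from "Multiple sub-brands listed"), and the function name check_intake are assumptions, not the skill's documented contract.

```python
from typing import Any, Dict, List


def check_intake(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an intake result; empty means consistent."""
    problems: List[str] = []
    if not doc.get("company"):
        problems.append("company field missing")
    if not doc.get("sources"):
        problems.append("sources missing")

    verdict = doc.get("verdict")
    if verdict == "AMBIGUOUS":
        brands = doc.get("sub_brands_detected") or []
        if len(brands) < 2:  # assumed minimum
            problems.append("fewer than two sub-brands listed")
        if any(not (isinstance(b, dict) and b.get("name")) for b in brands):
            problems.append("sub-brand entry without a name")
        if not doc.get("guidance"):
            problems.append("guidance missing")
        if doc.get("skill_targets"):
            problems.append("skill_targets must be empty for AMBIGUOUS")
    elif verdict == "SKIP":
        for field in ("skip_reason", "search_attempted", "would_change_verdict_if"):
            if not doc.get(field):
                problems.append(field + " missing")
    return problems


if __name__ == "__main__":
    example = {  # placeholder data for illustration
        "company": "ExampleCo",
        "verdict": "AMBIGUOUS",
        "sources": ["https://example.com"],
        "sub_brands_detected": [{"name": "Brand A"}, {"name": "Brand B"}],
        "guidance": "Re-run with a specific sub-brand of the parent company.",
        "skill_targets": [],
    }
    print(check_intake(example))
```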

100%

21%

Target Selection for OrbitLabs

Slug-based lookup, produce-mode ranking, skip selection

| Criteria | Without context | With context |
| --- | --- | --- |
| Most-recent run selected | 100% | 100% |
| Low-confidence target absent | 0% | 100% |
| Correct rank order by raw confidence | 100% | 100% |
| No booth-aha formula applied | 100% | 100% |
| No consume-mode columns | 100% | 100% |
| Common columns present | 62% | 100% |
| selection.json in run directory | 100% | 100% |
| selection_status is skipped | 100% | 100% |
| selected_target_id is null | 0% | 100% |
| selection_rationale is non-empty | 100% | 100% |
| schema_version and selected_at present | 100% | 100% |
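
The produce-mode criteria above ("Low-confidence target absent", "Correct rank order by raw confidence") describe a filter followed by a sort. Here is a minimal sketch of that ranking step. The target shape (a dictionary with id and confidence keys), the function name, and the cutoff value are all assumptions for illustration; the skill's real threshold is not stated on this page, so it is a required parameter.

```python
from typing import Any, Dict, List


def rank_targets(targets: List[Dict[str, Any]], min_confidence: float) -> List[Dict[str, Any]]:
    """Drop targets below min_confidence, then order the rest by raw confidence, highest first."""
    kept = [t for t in targets if t["confidence"] >= min_confidence]
    # sorted() is stable, so targets with equal confidence keep their input order.
    return sorted(kept, key=lambda t: t["confidence"], reverse=True)


if __name__ == "__main__":
    candidates = [  # placeholder data
        {"id": "t1", "confidence": 0.55},
        {"id": "t2", "confidence": 0.90},
        {"id": "t3", "confidence": 0.20},
    ]
    for target in rank_targets(candidates, min_confidence=0.5):
        print(target["id"], target["confidence"])
```

No booth-aha weighting is applied here, which matches the produce-mode criteria; a consume-mode ranking would need the skill's own scoring formula.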

92%

45%

Target Selection for StreamlineHQ

Consume-mode booth-aha scoring and selection persistence

| Criteria | Without context | With context |
| --- | --- | --- |
| Low-confidence target dropped | 0% | 100% |
| Correct rank order | 80% | 100% |
| Booth-aha score column present | 30% | 100% |
| Consume-mode columns included | 40% | 100% |
| Common columns present | 100% | 100% |
| selection.json in inputs/ | 0% | 100% |
| schema_version is 1 | 100% | 100% |
| discovery_path is absolute | 0% | 100% |
| selection_status is selected | 100% | 100% |
| selected_at is ISO-8601 UTC | 100% | 100% |
| Validator script run | 0% | 0% |
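
Several selection.json criteria from the two target-selection scenarios can be expressed as simple field checks. The sketch below is not the pipeline's validator script, which is not shown on this page; it is an illustrative checker whose rules are inferred from the criterion labels. It assumes POSIX-style paths, accepts a trailing Z as UTC, and requires a rationale only when the status is skipped.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath
from typing import Any, Dict, List


def is_utc_iso8601(value: Any) -> bool:
    """True for an ISO-8601 timestamp with a zero UTC offset (a trailing Z is accepted)."""
    if not isinstance(value, str):
        return False
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return False
    return parsed.utcoffset() == timedelta(0)  # naive timestamps yield None and fail


def validate_selection(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems: List[str] = []
    if doc.get("schema_version") != 1:
        problems.append("schema_version must be 1")
    path = doc.get("discovery_path")
    if not isinstance(path, str) or not PurePosixPath(path).is_absolute():
        problems.append("discovery_path must be absolute")
    status = doc.get("selection_status")
    if status not in ("selected", "skipped"):
        problems.append("selection_status must be selected or skipped")
    if status == "selected" and not doc.get("selected_target_id"):
        problems.append("selected_target_id required when selected")
    if status == "skipped":
        if doc.get("selected_target_id") is not None:
            problems.append("selected_target_id must be null when skipped")
        if not doc.get("selection_rationale"):
            problems.append("selection_rationale must be non-empty when skipped")
    if not is_utc_iso8601(doc.get("selected_at")):
        problems.append("selected_at must be ISO-8601 UTC")
    return problems


if __name__ == "__main__":
    example = {  # placeholder values for illustration
        "schema_version": 1,
        "discovery_path": "/runs/example/discovery.json",
        "selection_status": "selected",
        "selected_target_id": "t2",
        "selected_at": "2025-01-01T12:00:00Z",
    }
    print(validate_selection(example))
```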

Evaluated with agent: Claude; model: Claude Sonnet 4.6
