ainativedev/latest-aidevcon-speakers-london-2026

AI Native DevCon 2026 London — all conference sessions as interactive skills

talk-groetzinger-skills-everywhere/SKILL.md

name: talk-groetzinger-skills-everywhere
description: Use when the user asks about John Groetzinger's talk 'Skills Everywhere: Pipelining Knowledge Your Engineers Can Read and Your Agents Can Use', including questions about Cisco Customer Experience's context pipelines, treating skills as the durable investment over changing models or harnesses, knowledge-base-article-to-skill conversion with LLM-gated diffs, evals as unit tests for agents, JSONL dataset schemas, semantic versioning of skills (0.0.x to 1.0), the 'is this a skill?' cultural reflex, syncing a single skill README to both agent registries and Confluence, or applying his approach to scaling agentic development across distributed engineering teams. Also use when auditing a team's agentic setup against Groetzinger's framework, drafting artifacts he prescribed (eval datasets, skill files, KB-to-skill pipelines), or applying his frameworks to a user's current context-engineering or documentation challenges.
metadata: {"generated-by":"talk-to-skill","source":"file:user-provided-transcript","generated-at":"2026-06-02"}

Skills Everywhere — John Groetzinger (Cisco Customer Experience)

Groetzinger argues that frontier models are already capable enough to deliver business value; the real bottleneck is context engineering, and skills are the durable investment because they port across harnesses and models. He walks through two Cisco patterns: (1) a support-side pipeline that converts curated knowledge-base articles into agent skills with change-severity-gated human review, and (2) a developer-side pattern of shipping an evaluation framework to eight globally distributed teams as an installable skill instead of an onboarding meeting. The unifying discipline is twofold: evals as unit tests for agents, and a cultural reflex of asking "Is this a skill?" before writing down any piece of institutional knowledge.

Key Frameworks

Skills as the Durable Investment

  • Models change, harnesses change, but a well-written skill travels with the team unchanged.
  • A skill README synced to both the agent registry and Confluence means engineers and agents share one source of truth.
  • The "is this a skill?" reflex: when capturing any process, procedure, or piece of tribal knowledge, ask first whether it belongs in a skill file rather than a Slack thread or one-off doc.

Evals as Unit Tests

  • Every skill should ship with an eval dataset — a JSONL file of input/expected-output pairs that can be re-run when models or harnesses change.
  • Evals catch regressions without manual testing; they are the CI/CD layer for agent behaviour.
  • Eval datasets travel with the skill so new teams can validate behaviour immediately after installation.

Eval Dataset Format (JSONL Schema)

Each line in the eval dataset is a self-contained JSON object:

{"input": "<user turn or task prompt>", "expected": "<ideal agent response or action>", "tags": ["<scenario-tag>"]}

Example entries:

{"input": "How do I reset a Cisco device to factory defaults?", "expected": "Step 1: ...", "tags": ["reset", "hardware"]}
{"input": "Summarise the BGP configuration policy.", "expected": "The BGP policy requires ...", "tags": ["networking", "policy"]}

The dataset file is co-versioned with the skill (see Semantic Versioning below). A failing eval on a new model version is a signal to update the skill before deploying to production.
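
A dataset in this shape can be checked mechanically before it is ever run against an agent. The following is a minimal sketch, assuming the three fields shown above are all required and that `tags` is a list of strings; the function name `parse_eval_dataset` and the error messages are illustrative and not part of Groetzinger's tooling.

```python
import json

# Assumed schema, taken from the JSONL example above: field name -> required type.
REQUIRED_FIELDS = {"input": str, "expected": str, "tags": list}


def parse_eval_dataset(text):
    """Parse JSONL text into a list of eval entries, raising ValueError on a bad line."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between entries
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
        if not isinstance(entry, dict):
            raise ValueError(f"line {line_no}: each line must be a JSON object")
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(entry.get(field), expected_type):
                raise ValueError(
                    f"line {line_no}: '{field}' must be a {expected_type.__name__}"
                )
        if not all(isinstance(tag, str) for tag in entry["tags"]):
            raise ValueError(f"line {line_no}: every tag must be a string")
        entries.append(entry)
    return entries


# The two example entries from this document, used as sample input.
SAMPLE = "\n".join([
    '{"input": "How do I reset a Cisco device to factory defaults?", '
    '"expected": "Step 1: ...", "tags": ["reset", "hardware"]}',
    '{"input": "Summarise the BGP configuration policy.", '
    '"expected": "The BGP policy requires ...", "tags": ["networking", "policy"]}',
])

entries = parse_eval_dataset(SAMPLE)
print(f"{len(entries)} valid entries")
```

Failing fast with a line number keeps a malformed entry from silently shrinking the eval suite.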

KB-to-Skill Pipeline (Support-Side Pattern)

Groetzinger's Cisco support team converts curated knowledge-base articles into agent skills using an LLM-gated diff pipeline:

  1. Ingest — Load a curated KB article into the pipeline alongside the current skill version.
  2. LLM diff analysis — An LLM evaluates the article against the current skill and scores the semantic delta (low / medium / high severity).
  3. Gate decision (logged as a validation checkpoint):
    • Low severity → auto-merge to skill, bump patch version (0.0.x).
    • Medium severity → flag for human review; reviewer approves or rejects.
    • High severity → mandatory human review plus eval dataset update before any merge.
  4. Skill update — Approved changes are written to the skill file with updated frontmatter version.
  5. Eval re-run (logged pass/fail) — Execute the skill's eval dataset to confirm no regressions.
  6. Sync (logged as confirmed) — Push the updated skill README to both the agent registry and Confluence.
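
The gate in step 3 reduces to a small lookup from severity to required approvals. The sketch below assumes the severity label has already been produced by the LLM diff step and is passed in as a string; `GateDecision`, `gate`, and `may_merge` are hypothetical names chosen for illustration, not the pipeline's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    auto_merge: bool          # change may be merged without a reviewer
    needs_human_review: bool  # a reviewer must approve before merge
    needs_eval_update: bool   # the eval dataset must be updated before merge


# Severity -> requirements, mirroring the three gate outcomes listed above.
GATES = {
    "low": GateDecision(auto_merge=True, needs_human_review=False, needs_eval_update=False),
    "medium": GateDecision(auto_merge=False, needs_human_review=True, needs_eval_update=False),
    "high": GateDecision(auto_merge=False, needs_human_review=True, needs_eval_update=True),
}


def gate(severity):
    """Return the requirements for a severity label, rejecting unknown labels."""
    try:
        return GATES[severity.lower()]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None


def may_merge(severity, human_approved=False, evals_updated=False):
    """Decide whether a change can be merged given the approvals collected so far."""
    decision = gate(severity)
    if decision.auto_merge:
        return True
    if decision.needs_human_review and not human_approved:
        return False
    if decision.needs_eval_update and not evals_updated:
        return False
    return True


print(may_merge("low"))
print(may_merge("high", human_approved=True))
```

Rejecting unknown labels outright, instead of defaulting them to low severity, means a malformed LLM response cannot slip through as an auto-merge.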

Skill File Structure

A minimal skill following Groetzinger's pattern:

---
name: <kebab-case-name>
description: "Use when ... <trigger scenarios and natural keywords>"
version: 0.1.0
---

# <Skill Title>

<One-paragraph summary of what the skill does and why it exists.>

## When to Use
- <Trigger condition 1>
- <Trigger condition 2>

## Steps
1. <Step 1>
2. <Step 2>

## Evals
See `evals.jsonl` co-located with this file.
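
A template like this is easy to lint. The sketch below reads the frontmatter with plain string handling and reports problems; it assumes flat `key: value` lines only (no nested YAML), and the specific rules (kebab-case name, description starting with "Use when", three-part version) are inferred from the template and are not stated requirements from the talk.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
THREE_PART_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def read_frontmatter(text):
    """Return the flat key/value pairs between the opening and closing '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so values may themselves contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    raise ValueError("frontmatter is not closed with '---'")


def check_skill(text):
    """Return a list of human-readable problems; an empty list means the file passes."""
    fields = read_frontmatter(text)
    problems = []
    if not KEBAB_CASE.match(fields.get("name", "")):
        problems.append("name must be kebab-case")
    if not fields.get("description", "").startswith("Use when"):
        problems.append("description should start with 'Use when'")
    if not THREE_PART_VERSION.match(fields.get("version", "")):
        problems.append("version must look like MAJOR.MINOR.PATCH")
    return problems


# Hypothetical skill file used only to exercise the checks.
EXAMPLE = """---
name: reset-device
description: "Use when the user asks how to reset a device"
version: 0.1.0
---

# Reset Device
"""

print(check_skill(EXAMPLE))
```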

Semantic Versioning Progression (0.0.x → 1.0)

| Version | Meaning |
| --- | --- |
| 0.0.x | Patch: minor wording fixes, no behaviour change |
| 0.x.0 | Minor: new content added, no breaking change to evals |
| 1.0.0 | Stable: passing eval suite, at least one production deployment, human-reviewed README synced to Confluence |

A skill graduates to 1.0 when it has a consistently passing eval suite, a production deployment record, and a human-reviewed README synced to the internal wiki.
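
The progression can be expressed as two small helpers: one that bumps a version string and one that checks the three graduation criteria. This is a sketch under the assumption that versions are plain `MAJOR.MINOR.PATCH` strings; `bump` and `ready_for_1_0` are hypothetical names, and how each criterion is measured is left to the caller.

```python
def bump(version, change):
    """Return the next version for a 'patch' or 'minor' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if change == "minor":
        return f"{major}.{minor + 1}.0"  # a minor bump resets the patch number
    raise ValueError(f"unsupported change type: {change!r}")


def ready_for_1_0(evals_passing, production_deployments, readme_reviewed_and_synced):
    """Check the three graduation criteria: evals, deployment record, reviewed README."""
    return bool(
        evals_passing
        and production_deployments >= 1
        and readme_reviewed_and_synced
    )


print(bump("0.0.3", "patch"))
print(bump("0.0.4", "minor"))
print(ready_for_1_0(True, 0, True))
```

Promotion to 1.0.0 is deliberately not a `bump` option here, since the text treats it as a graduation decision and not a routine increment.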

Applying to Your Team

Auditing an Agentic Setup Against Groetzinger's Framework

  1. Do your skills travel independently of harness and model? (harness-portability check)
  2. Does every skill have a co-located eval dataset (JSONL) that runs in CI?
  3. Is there a single README synced to both the agent registry and your internal wiki?
  4. Is there an active "is this a skill?" cultural reflex before writing procedures in Slack or Confluence?
  5. Are KB article updates fed through a change-severity gate before modifying skills?
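
Check 2 is the easiest to automate. The sketch below walks a directory tree and lists every skill folder that lacks a co-located dataset; it assumes each skill lives in its own folder containing `SKILL.md` and that the dataset is named `evals.jsonl`, as in the template above. The function name and the directory layout are illustrative assumptions.

```python
from pathlib import Path


def skills_missing_evals(root):
    """Return the names of skill folders under root that have no evals.jsonl."""
    missing = []
    for skill_file in sorted(Path(root).rglob("SKILL.md")):
        # The dataset is expected to sit next to the skill file.
        if not (skill_file.parent / "evals.jsonl").is_file():
            missing.append(skill_file.parent.name)
    return missing
```

Running this in CI and failing the build when the returned list is non-empty turns the audit question into an enforced rule.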

Shipping an Eval Framework to Distributed Teams

Groetzinger's approach for scaling to eight globally distributed teams:

  1. Package the eval runner as an installable skill — not a slide deck or onboarding document.
  2. Include sample JSONL datasets so teams can immediately validate their local environment.
  3. Provide a skill template (see Skill File Structure above) so every team produces consistent, portable artifacts.
  4. Gate pull requests on eval pass — let the pipeline enforce the standard rather than relying on human reminders.
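
The core of such a runner is small. The sketch below assumes the agent can be called as a plain function from prompt to response and, by default, compares responses by exact match after trimming whitespace; both are simplifications, since the source does not describe how responses are scored. `run_evals` and `exit_code` are hypothetical names.

```python
import json


def run_evals(jsonl_text, agent, matches=None):
    """Run every case through the agent; return (total cases, inputs that failed)."""
    if matches is None:
        # Assumed default scoring: exact match after trimming whitespace.
        matches = lambda actual, expected: actual.strip() == expected.strip()
    total = 0
    failures = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        case = json.loads(line)
        total += 1
        if not matches(agent(case["input"]), case["expected"]):
            failures.append(case["input"])
    return total, failures


def exit_code(failures):
    """Map results to a process exit code so a CI job can gate the pull request."""
    return 1 if failures else 0


# Stub agent standing in for a real model call.
def stub_agent(prompt):
    return prompt.upper()


DATASET = '{"input": "a", "expected": "A", "tags": []}\n{"input": "b", "expected": "B", "tags": []}'
total, failures = run_evals(DATASET, stub_agent)
print(f"{total - len(failures)}/{total} passed")
```

Passing `matches` as a parameter lets a team swap exact match for a looser comparison without changing the runner.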

Drafting Prescribed Artifacts

| Artifact | Starting Point |
| --- | --- |
| Eval dataset | Copy the JSONL schema above; write 5–10 entries covering key scenarios for the skill |
| Skill file | Use the Skill File Structure template above; start at version 0.1.0 |
| KB-to-skill pipeline | Follow the 6-step pipeline above; adjust gate thresholds to your team's review capacity |
| Confluence sync | Treat the skill README as the single source of truth; automate the push on merge |
