
ainativedevcon2026/talk-podjarny-skills-are-the-new-code-aindc

Skills are the new Code by Guy Podjarny



Outline — Skills are the new Code

Speaker

Guy Podjarny — Founder and CEO of Tessl, which is "reimagining software development for the AI era" and powers AI Native Dev Con. Previously founded Snyk (created the Developer Security category, now a multi-billion-dollar company with 1000+ employees). Former CTO at Akamai (following acquisition of his first startup). Active angel investor, co-host of the AI Native Dev podcast.

Abstract (as provided)

Software development has always revolved around the instructions we give the machine - punch cards, then assembly, then code. As agentic development takes hold, agent skills are becoming our unit of software - but we're not treating them this way.

In this keynote, I'll make the case that agent skills deserve the same rigour we've spent decades applying to code. Crafted with intention. Tested against real behaviour. Versioned and maintained in step with the project around them. Treating skills as an afterthought isn't just technical debt, it's the difference between AI that ships and AI that drifts.

Thesis (synthesis)

The new units of agentic software are tools, context, and harnesses — composing upward into factory lines and factories. Within those, reusable skills are the asset developers edit most, and they now exhibit the same failure modes code does: security risk, collaboration friction, and rot. The remedy is to import the toolchain we built for code — static analysis, dynamic tests (evals), dependency management, security tooling, and observability — and wrap them in a Context Development Life Cycle (CDLC) that humans own while agents handle the SDLC.

Section TOC

| Section | Summary | Transcript lines |
| --- | --- | --- |
| Pre-talk disclaimers | Guy notes this is a dry run, slides are still rough, asks for feedback | 1–18 |
| Opening & three-part agenda | New units of software; skills are the new code; dev tools for skills | 19–44 |
| Unit 1 — Tools | CLIs, MCPs, APIs; deterministic, save tokens, compose | 45–68 |
| Unit 2 — Context | Practices/policies, specs, workflows; rules, skills, passive context; skills compose | 69–98 |
| Unit 3 — Harnesses | Deterministic software wrapping probabilistic models; Claude Code as example; plugins and hooks; harnesses compose into factory lines | 99–138 |
| The agentic software stack | Tools → context → harnesses → factory lines → factories | 139–158 |
| Why skills are the dominant reusable context | Skills are "code-like" reusable context units; usage exploding | 159–172 |
| Problem 1 — Security & governance | Malicious skills (30%+ in OpenClaw), negligent skills, vulnerable skills (API keys) | 173–204 |
| Problem 2 — Collaboration & reuse | Unicorn platform team story; quality testing gap; dependency management gap | 205–232 |
| Problem 3 — Lifecycle & rot | Skills rot like software; opportunity for automated optimization from agent logs | 233–262 |
| Solution: treat skills as code | Five tools: static analysis, dynamic tests, dependency mgmt, security, observability | 263–278 |
| Tool 1 — Static analysis | Inspect skills without executing; Tessl Review; quality scores | 279–298 |
| Tool 2 — Dynamic tests (evals) | Evals are the new tests; scenarios at varying scope; can't scale without them | 299–334 |
| Tool 3 — Dependency management | Skills compose ⇒ skills are dependencies; versioning, manifests, platform compatibility | 335–360 |
| Tool 4 — Security tooling | Static analysis, supply chain, red teaming; Snyk integration | 361–388 |
| Tool 5 — Observability | Mine agent logs and PRs for real scenarios, gaps, new skill opportunities | 389–410 |
| The CDLC | Generate → test → optimize → distribute → observe; humans own CDLC, agents own SDLC | 411–428 |
| Wrap-up & the Tessl agent | Vertical agent for skill development; local, pipeline, control center | 429–470 |

Terminology glossary (Guy's own definitions)

  • Tools — "pieces of software that we call to be able to do something deterministically, and they really are what turns a model into an agent." Dominant kinds: CLIs (shell/Bash), MCPs, APIs.
  • Context — "giving the agent information that it either cannot know, like opinions, or that it can find out but it is very inefficient or error prone to do so." Three buckets: practices and policies, specs, workflows.
  • Rules — context "which we sort of always shove into the context. You will always have this in any LLM message that you send."
  • Skills — context that is "loaded on demand by the user or with some hints."
  • Passive context — "docs, information like architecture MD and such that might just sit in your repository and be available to the agents to find."
  • Harness — "deterministic software that wraps the probabilistic model... For instance, Claude [Code] is a [harness]." A harness chooses where context comes from and which tools are available, provides the UX, and sometimes constrains the model.
  • Hooks — "deterministic software that would run at specific points in time... takes away decision power from the model." Example: Intercom blocking gh pr open unless a PR skill is loaded.
  • Factory line — composed harnesses; Guy uses this term "just to disambiguate that from the harness that we use locally."
  • Malicious skill — "skills that are explicitly written to manipulate the agent to do something malicious."
  • Negligent skill — "skills that lack safety instructions" (e.g. failing to say "do not delete tables" when updating a DB).
  • Vulnerable skill — skills that do unsafe things, "the most common example here is API keys that are used in the open."
  • Evals — the skill equivalent of tests: "Defining scenarios that say, in this situation, this is how the agent should [behave]. Here's the setup. Here's the task for the agent. [Here]'s the judgment about what good looks like."
  • CDLC (Context Development Life Cycle) — the loop Guy proposes humans should own: generate skills, write evals, optimize from learnings, distribute with package management, observe in the wild, all while maintaining security and quality.
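The Hooks entry above describes deterministic code that takes a decision away from the model. A minimal sketch of that idea follows, modelled loosely on the `gh pr open` example. Everything here is an assumption made for illustration: `SessionState`, `pre_command_hook`, and the skill name `"pr"` are hypothetical and do not come from the talk or from any real harness API.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical harness-side record of which skills the agent has loaded."""
    loaded_skills: set[str] = field(default_factory=set)


def pre_command_hook(command: str, state: SessionState) -> tuple[bool, str]:
    """Decide deterministically whether a shell command may run.

    Returns (allowed, reason). The model never gets a vote: the harness
    calls this before executing the command and obeys the result.
    """
    # Assumed rule: opening a PR requires the (hypothetical) "pr" skill.
    if command.strip().startswith("gh pr open") and "pr" not in state.loaded_skills:
        return False, "load the PR skill before opening a pull request"
    return True, "ok"


state = SessionState()
print(pre_command_hook("gh pr open", state))  # blocked: skill not loaded
state.loaded_skills.add("pr")
print(pre_command_hook("gh pr open", state))  # allowed once the skill is loaded
```

The point of the sketch is the shape, not the rule: the check is ordinary code that runs at a fixed point in the harness, so its outcome does not depend on what the model decides.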

Named frameworks / concepts

  1. The agentic software stack — tools (bottom) → context → harnesses → factory lines → factories. "This is not a finite list, and... this will change often."
  2. Three problems that emerge as skills scale — (a) security & governance, (b) collaboration & reuse, (c) lifecycle & continuous optimization.
  3. The five code-development tools to bring to skills — static analysis, dynamic tests, dependency management, security tooling, observability.
  4. The CDLC loop — generate → test (static & dynamic) → optimize → distribute (with dependency mgmt) → observe → feed back. Wrapped in security and quality throughout.
  5. Carrot-and-stick of skill lifecycle — neglected skills rot and become harmful; skills that receive continued investment can yield "double our return, right, or an order of magnitude improvement" through automated optimization.
  6. Humans on CDLC, agents on SDLC — Guy's positioning of the division of labour.
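The "test (static & dynamic)" step of the loop can be sketched in a few lines. This is an illustration under stated assumptions, not Tessl's implementation: the regex, `static_findings`, `EvalScenario`, `run_eval`, and the stub agent are all hypothetical names. The static half inspects skill text without executing it, looking for the "API keys in the open" case. The dynamic half follows the setup / task / judgment shape from the Evals definition.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Assumed, deliberately crude pattern for a hard-coded secret.
# Real scanners use far richer rules.
KEY_PATTERN = re.compile(
    r"\b(api[_-]?key|token|secret)\b\s*[:=]\s*\S{16,}", re.IGNORECASE
)


def static_findings(skill_text: str) -> list[str]:
    """Static analysis: inspect the skill text without running anything."""
    findings = []
    if KEY_PATTERN.search(skill_text):
        findings.append("possible API key in the open")
    return findings


@dataclass
class EvalScenario:
    """Dynamic test: the setup, the task, and the judgment of what good looks like."""
    setup: str
    task: str
    judge: Callable[[str], bool]


def run_eval(scenario: EvalScenario, agent: Callable[[str, str], str]) -> bool:
    """Run the agent on the scenario and apply the judgment to its output."""
    output = agent(scenario.setup, scenario.task)
    return scenario.judge(output)


def stub_agent(setup: str, task: str) -> str:
    """Stand-in for a real agent call, so the sketch runs offline."""
    return "UPDATE users SET plan = 'pro' WHERE id = 42;"


scenario = EvalScenario(
    setup="A database with a users table",
    task="Upgrade user 42 to the pro plan",
    # Judgment: the agent must not delete tables or rows while updating.
    judge=lambda out: "DROP TABLE" not in out.upper()
    and "DELETE FROM" not in out.upper(),
)

print(static_findings("api_key = 'sk_live_abcdefghijklmnop1234'"))
print(run_eval(scenario, stub_agent))
```

In practice the judgment is often another model call or a richer assertion than a substring check; the sketch only shows where it plugs in.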

Open questions / not covered

  • Concrete metrics for what a "good" quality score looks like, or thresholds.
  • How to write good eval scenarios in practice — Guy notes "you can write very useful ones, or you can write... ones that just waste your time" but doesn't give criteria.
  • Specifics of skill versioning semantics (semver? something else?).
  • How harness compatibility is actually expressed in skill manifests.
  • Pricing, availability, or release timeline of the Tessl agent.
  • How to handle multi-tenant or cross-org skill sharing/governance.
  • Comparison with specific competing platforms beyond a brief mention of Snyk and Socket.
  • Detailed worked example of red-teaming a skill.
  • The participants list and audience Q&A — this is a dry run, not the live talk.
