New: Build your software factory with Tessl AgentLearn more

Log in Book a Demo

PODCAST EPISODE 107

Don't Secure the Code. Secure the Coder.

AI agents escape sandboxes & delete files to complete tasks. Security must evolve: stop securing the code, start securing the coder.

26 May 202640 min 21 sec

AI Security & Safety

Transcript

In this episode

AI agents don't just write insecure code — they can escape their sandboxes, delete files, and do whatever it takes to complete a task. The security mental model that served us through the cloud era isn't enough anymore. Guy Podjarny, founder of Snyk and CEO of Tessl, made the case at London's AI Security Summit: it's time to stop securing the code and start securing the coder.

Recorded live at the AI Security Summit in London, this episode features conversations with Brian Vermeer (Snyk), Sam Stepanyan (OWASP London), and a full recording of Guy's keynote on why agentic development demands a fundamentally different approach to security.

What we cover:

Why shadow AI is the new shadow IT — and why CISOs can't secure what they can't see
Skills as a new supply chain attack surface (malicious, vulnerable, and negligent skills)
Why more context is not always better — and what the data says about focused skill design
The OWASP Top Ten for Agentic AI and what it means for teams building today
Why security must become agentic to keep up with the attackers who already are
The Context Development Lifecycle (CDLC) and how leading orgs are using it

What's the biggest security risk your team isn't talking about when it comes to agentic development? Drop it in the comments.

AI Agent Security: Why You Need to Secure the Coder, Not Just the Code

The pressure to ship with AI is real, and so is the gap in how most teams think about what that means for security. A conversation recorded live at London's AI Security Summit offers a useful lens: as software development shifts from augmented coding to agentic delegation, the security surface has shifted too. The question is no longer just "is this code secure?" but "is the agent writing the code secure?"

Guy Podjarny, founder of Snyk and CEO of Tessl, made that case directly in his keynote. The AI Native Dev podcast captured the full session, alongside conversations with Brian Vermeer (Staff Developer Advocate at Snyk) and Sam Stepanyan (head of OWASP London). The through-line across all three: AI agent security requires fundamentally different thinking, and most organisations are still applying the old model.

AI Agent Evaluation: You Can't Secure What You Can't Measure

Podjarny's opening argument borrows from the DevOps playbook: if it moves, measure it. AI agents are non-deterministic by nature, which means the "scan it once, it's done" approach to security no longer applies. An agent that passed a security eval yesterday may behave differently today.

The practical implication is that teams need to run agents against defined tasks repeatedly, score the results statistically, and treat agent behaviour the way they treat server uptime: as something that fluctuates, drifts, and requires continuous observation.

The data he presented was instructive. Running a code generation task ten times across multiple scenarios produced wildly inconsistent results — not just different quality levels, but fundamentally different approaches. Without a baseline of measurement, there's no way to know whether a change to an agent's context improved or degraded its security posture.

Context Engineering for AI Agents: Less Is Often More

To make sense of how to improve agent reliability, it helps to think about context as a design problem. The most common unit of context being used today is the "skill" — a structured markdown document that gives an agent domain-specific knowledge it wouldn't otherwise have.

Podjarny shared an example that challenges a common assumption: he took the Code Guard skill (a Cisco-created set of OWASP security rules packaged for agents) and ran an authorisation-focused evaluation. The full skill improved agent accuracy from 48% to roughly 75%. But when he extracted only the authorisation-relevant 5% of the content and used that instead, accuracy jumped to 98%.

The reason, as he explained it, is that attention is a scarce resource for models just as it is for humans. A skill that covers 100 things gives the model less signal on each than a skill that covers three well-chosen ones. Context engineering for AI agents isn't about providing everything that might be relevant. It's about choosing what actually matters for the task at hand and leaving the rest out.

This has direct implications for security teams building or deploying skills intended to guide agents toward secure coding practices. A narrowly scoped skill targeting a specific vulnerability class will likely outperform a comprehensive security bible.

Skills as a Supply Chain Attack Surface

One of the more striking reframes in Podjarny's talk was repositioning skills from "documents" to "units of software." They look like markdown files or Notion pages, but the agent executes them — which means they carry the same risks as any other code dependency.

He outlined three categories of problematic skills: malicious skills intentionally crafted by attackers; vulnerable skills containing insecure patterns like plaintext API keys; and negligent skills — the most common category — which simply lack basic safety instructions.

Brian Vermeer noted the parallel to traditional software dependencies: skills are text, but that text can contain injection attacks that propagate into an agent's global memory. Even after the skill is removed, the injected content can persist. The supply chain hygiene that teams apply to npm packages and Docker containers needs to extend to skills.

Agentic Identity and the Audit Problem

Sam Stepanyan, speaking from his work with OWASP London, raised a related challenge that doesn't get enough attention: agentic identity. When an AI agent acts on behalf of a human, the human's identity is what surfaces in logs, audit trails, and access records. If an agent sends emails, modifies records, or accesses sensitive systems while operating under someone's credentials, the standard forensic questions become very hard to answer.

Stepanyan's comparison to early e-commerce is worth sitting with. Twenty-five years ago, businesses rushed to put transactions online without thinking about SQL injection. The parallels to the current AI moment are direct: the same pressure to move fast, the same tendency to defer security thinking, and the same eventual reckoning.

Security Must Become Agentic

The episode's closing argument is also its most consequential. Podjarny frames the current moment as analogous to the shift from waterfall to DevOps: practices that were merely suboptimal in one paradigm become untenable in the next. Manual security audits survived in waterfall. They didn't survive DevOps. Many current security practices will not survive the agent era.

The optimistic version of this is that agents can help fix the problems they create. The same automation that makes agents dangerous makes them capable of doing continuous security scanning, dependency auditing, and vulnerability remediation at a scale and consistency no human team can match.

The full conversation, including Guy's keynote and the summit floor interviews, is worth a listen for anyone thinking seriously about what AI agent security actually requires.

AI Security & Safety

CHAPTERS

Don't Secure the Code. Secure the Coder.