The Hidden Security Risks of AI Coding Agents

New: Build your software factory with Tessl AgentLearn more

Careers Docs Book a Demo

PODCAST EPISODE 106

The Hidden Security Risks of AI Coding Agents

Your AI coding agent has access to your secrets, pulls in content from the outside world, and can run shell commands. That combination makes you one prompt injection away from a very bad time.

19 May 202641 min 11 secwith Joe Holdcroft

AI Security & Safety

In this episode

Your coding agent is one prompt injection away from a very bad day.

Most AI coding agents have 3 dangerous properties by default:

Access to your private data and secrets
Exposure to untrusted content from the outside world
The ability to run shell commands and hit external endpoints

That combination isn't just risky — it's a ticking clock.

This is why AI agent security isn't optional. It's the conversation we need to be having right now.

AI Agent Security and the Context Supply Chain Problem Every Dev Team Is Missing

The security conversation around AI-native development tends to fixate on what agents might write — vulnerable code, hallucinated logic, insecure patterns. What it underestimates is what agents might read, and what they might do with everything on a developer's machine. A recent episode of the AI Native Dev podcast explored exactly this territory with Joe Holdcroft, member of technical staff at Tessl and former Snyk engineer, whose background spans hypergrowth security tooling and fractional CTO work across multiple organisations.

The conversation surfaces a framework for thinking about agentic security risk that goes well beyond static analysis and code review.

The Lethal Trifecta: Why Every Coding Agent Is Inherently Dangerous

To make sense of the new attack surface, it helps to start with what security researcher Simon Wilson calls the Lethal Trifecta. The model identifies three characteristics that virtually every coding agent shares by default: access to privileged or private information, exposure to untrusted external content, and the ability to communicate with the outside world.

Holdcroft pointed out that a typical developer's coding agent checks all three boxes simultaneously. It can read secrets stored on the machine, pull in documentation, search results, and dependencies from the internet, and execute shell commands and HTTP requests. As he noted, that combination puts most development environments "one prompt injection away from a bad time." The risk is not theoretical — it is the default configuration for the majority of agentic workflows in use today.

Text as a New Attack Surface in AI Agent Security

One of the most counterintuitive shifts in AI agent security is that text itself has become a potential vulnerability. Two years ago, a markdown file was inert. Today, a SKILL.md or any other text document fed to an agent can carry instructions that modify agent behaviour — whether accidentally or by design.

Holdcroft explained that static analysis cannot reliably defend against this class of risk. There are too many ways to phrase something that encourages an agent toward unintended behaviour without triggering rule-based detection. The practical implication is that LLM-as-judge approaches — where another model evaluates context for potentially risky content — appear to be the more viable defensive layer for this particular threat vector. This is part of the reasoning behind Tessl's integration with Snyk's agent scan tooling, which runs security checks on skills published to the Tessl registry and surfaces contextual risk to the end user before installation.

The Context Supply Chain: A Decade of Open Source Lessons Forgotten

The most underrated risk Holdcroft identified is one the industry has largely overlooked: the context supply chain. Developers are pulling skills, MCP configurations, and SKILL.md files from random GitHub repositories with the same casual attitude that characterised early npm usage — before supply chain attacks made software provenance a serious engineering concern.

"We've kind of forgotten about ten, twenty years of software engineering," Holdcroft observed, noting that properly versioned, provenance-verified dependencies are standard practice for code but essentially non-existent for context. Skills are being installed without scanning, without version pinning, and without any clear ownership or audit trail.

This gap gives rise to what the episode terms the CBOM — a Context Bill of Materials, analogous to the Software Bill of Materials (SBOM) that has become a compliance expectation in enterprise software. A CBOM would capture which skills and context fragments are active in a project, where they originated, what version they are at, and what security checks they have passed. Tessl's CLI is moving in this direction, generating a manifest that can be checked in CI pipelines and evaluated against allow-lists or security ratings.

The additional complexity, as Holdcroft noted, is scope. An SBOM covers what is in a repository. A CBOM also needs to account for what is installed globally on a developer's machine — context that may never appear in version control at all.

Slop Squatting and the Package Hallucination Problem

A related attack vector has emerged at the intersection of agent behaviour and package ecosystems. Agents frequently hallucinate package names — guessing at a likely identifier and attempting to install it. Attackers are exploiting this by registering packages under names they predict agents will invent, embedding malicious payloads that then enter a project's supply chain automatically.

This practice has acquired a name: slop squatting. What makes it particularly difficult to defend against is that agents do not evaluate packages the way experienced developers do. A human choosing an open source library looks at stars, maintainer reputation, age, and sponsorship — social proof signals that have served as reasonable quality filters. An agent selects based on training data prevalence or surface-level task match, bypassing all of that context entirely.

Reversibility as a Security Framework for Agentic Development

When it comes to the practical question of how much autonomy to grant an agent, Holdcroft proposed thinking in terms of reversibility rather than risk in the abstract. Low-risk, reversible actions — working within a feature branch, pushing to a git remote — are reasonable candidates for autonomous agent operation. High-stakes, irreversible actions — pushing directly to production, modifying auth services, touching customer data — warrant meaningful human gates.

The contractor analogy he offered seems useful here: a new contractor gets code to write and projects to work on, but probably does not get direct production access on day one. Their first few pull requests receive proper review. References are checked. Trust is extended incrementally. Applying that same graduated logic to agents, rather than treating them as either fully autonomous or fully supervised, appears to be where the more thoughtful teams are landing.

Human oversight in agentic development is not going away, but its character is shifting. As Holdcroft put it, humans will always be a bottleneck — the question is which part of the development pipeline that bottleneck sits in. Automating the low-risk review work and concentrating human attention on auth changes, new data handling, and security-adjacent PRs is a more sustainable model than attempting to review everything with equal depth.

The AI Native Dev podcast continues to explore the evolving practice of building software with AI. If your team has developed a context governance approach worth discussing — or is still figuring it out — the comments are a good place to start that conversation.

AI Security & Safety

CHAPTERS

The Hidden Security Risks of AI Coding Agents