
The Greatest Time to Build a Startup (The AI-Native Advantage)

with Daniel Jones

Chapters

Trailer
[00:00:00]
AI DevCon
[00:01:10]
Introduction
[00:02:03]
Good engineering practices for agentic development
[00:03:42]
Why tests matter for agents
[00:06:02]
Version control and commit hygiene
[00:09:36]
Safety and containerisation
[00:13:56]
Spec-driven development and user stories
[00:18:53]
Tool selection across organisations
[00:21:48]
Context management fundamentals
[00:26:33]
Managing skills and MCP servers
[00:30:38]
Sharing context across teams
[00:34:43]
Understanding hallucinations
[00:38:25]
Agents vs LLMs explained
[00:41:43]
Non-determinism and superstition
[00:46:10]
Measuring skill effectiveness
[00:49:52]
Platform teams and rollout
[00:53:01]
Tips for developers
[00:56:42]
Tips for organisations
[00:57:42]
The future of agentic development
[00:59:01]

In this episode

The best agentic developers throw away their agent's work without guilt, run three agents at once and only use one, and treat their AI like a junior developer they genuinely dislike. It sounds wrong. It works.

Daniel Jones, Head of Product at re:cinq, has upskilled hundreds of developers across Northern Europe's largest enterprises. In this episode he joins Simon Maple to share the counterintuitive habits, hard data, and practical frameworks behind high-performing agentic development teams.

On the docket:

  • Why bad engineering practices get worse, not better, with AI agents
  • The exact conditions that make your agent hallucinate every time
  • Why your AGENTS.md is quietly working against you
  • How to manage context before it kills your productivity
  • What enterprise AI rollout actually looks like at scale
  • Why the worst managers get the most out of agentic coding

If your team is adding AI and wondering why things aren't getting faster, this episode is for you.

The Hidden Pitfalls of Enterprise Agent Coding Adoption

Teams with high development maturity go faster when they introduce coding agents. Teams with low maturity go slower. This finding from the 2025 Dora report captures a fundamental tension in enterprise AI adoption: the same tools that accelerate well-functioning teams can amplify dysfunction in struggling ones. In a recent episode of the AI Native Dev podcast, Simon Maple sat down with Daniel Jones of re:cinq, a consultancy helping organisations across Northern Europe navigate AI-native transformation.

The conversation surfaced practical guidance for both individual developers and organisations looking to roll out agent coding at scale, grounded in Daniel's experience working with enterprises ranging from financial institutions to SaaS providers.

Why Good Engineering Practices Matter More Now

The theory of constraints explains why agent coding can make things worse for unprepared teams. Speed up one part of a system dramatically, and bottlenecks appear elsewhere. If coding agents can produce commits minute after minute but it takes three days to get anything into production, the tip of the branch races far ahead of what has shipped, and merge conflicts pile up whenever CI fails.

This applies across multiple dimensions: test coverage, coding standards alignment, story sizing, and path to production. As Daniel explained, "There's quite a lot that really would be considered just general good practice from the last decade or twenty years of software development that needs to be considered just because everything goes much faster now."

Test coverage deserves particular attention. Agents need to perceive problems to react to them. Without failing tests, an agent can break functionality and remain none the wiser. As Daniel put it: "Whenever you're getting frustrated with a coding agent, you need to consider: could it have known better? Did it have the information to know that it was doing the wrong thing?"

The conversation surfaced an interesting nuance here: perhaps unit tests are no longer primarily for humans. Agents can write as many as they like to verify their own work. But acceptance tests that define what users should be able to do with the system remain essential as the ultimate safety barrier and as documentation of expected behaviour. This connects naturally to spec-driven development (https://claude.ai/blog/spec-driven-development-guide) practices where explicit specifications guide agent behaviour.
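The distinction can be sketched in code. The example below is illustrative only: the to-do app and its methods are invented for this sketch, not taken from the episode. The point is that an acceptance test is phrased in terms of what a user can do, so it survives the internal refactors an agent might make while it rewrites unit tests freely.

```python
# Hypothetical to-do app used only to illustrate the unit vs.
# acceptance test distinction. All names here are assumptions.

class TodoApp:
    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append({"text": text, "done": False})

    def complete(self, index):
        self._items[index]["done"] = True

    def remaining(self):
        return [i["text"] for i in self._items if not i["done"]]


def test_user_can_track_and_complete_tasks():
    # Acceptance test: describes observable user behaviour, not
    # internals. An agent may restructure TodoApp however it likes,
    # but this safety barrier must keep passing.
    app = TodoApp()
    app.add("write spec")
    app.add("review PR")
    app.complete(0)
    assert app.remaining() == ["review PR"]
```

An agent could delete and regenerate any number of unit tests against `TodoApp`'s internals, but a test like this one documents expected behaviour for humans and agents alike.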

Context Management as Core Competency

Academic research suggests that above 30,000 tokens of context, reasoning ability drops by approximately fifteen percent. More recent work on GPT-4.1-era models found similar degradation between 30,000 and 60,000 tokens. That threshold is lower than many developers realise. Daniel noted that opening Claude Code with no MCP servers installed and no configuration can already consume around 40,000 tokens just for system prompts.

This creates practical constraints. Clear context regularly. Stay on topic within conversations. Use different tools for different types of questions, such as a web browser for research, a terminal agent for command-line tasks, and a coding agent for feature development.
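The budgeting discipline above can be made concrete with a rough sketch. The four-characters-per-token ratio is a common rule of thumb, not an exact tokenizer count, and the 30,000-token threshold is the one from the research cited earlier; both numbers are illustrative.

```python
# Rough context-budget check before continuing a conversation.
# CHARS_PER_TOKEN is a crude heuristic; real tokenizers vary.

TOKEN_THRESHOLD = 30_000   # degradation threshold cited above
CHARS_PER_TOKEN = 4        # rule-of-thumb estimate


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def should_clear_context(conversation: list) -> bool:
    """True when the estimated running total crosses the threshold."""
    total = sum(estimate_tokens(msg) for msg in conversation)
    return total > TOKEN_THRESHOLD


# ~150,000 characters of history -> ~37,500 estimated tokens
history = ["..." * 1000] * 50
print(should_clear_context(history))  # True: time to clear context
```

A real implementation would use the provider's own tokenizer, but even a crude counter like this makes the "clear context regularly" advice actionable.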

The conversation also challenged a pattern that became fashionable in late 2024: adding instructions to agent configuration files every time the agent makes a mistake. "What you're doing there is adding something that's largely irrelevant most of the time in the hope that it's relevant this time. That's not a great thing to do."

Similarly, adding every available MCP server creates overhead. Tool definitions get sent to the model on every prompt. Skills with progressive disclosure, where the model receives a summary and can request more detail when needed, offer a better approach than frontloading everything into context.
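The difference between frontloading and progressive disclosure can be sketched as follows. The skill names and contents are invented for illustration; the structure mirrors the idea described above, where only one-line summaries occupy context and full instructions load on demand.

```python
# Sketch of progressive disclosure for agent skills. Skill names
# and bodies are hypothetical examples, not a real skill library.

SKILLS = {
    "release-notes": {
        "summary": "Drafts release notes from merged PRs.",
        "detail": "Step 1: list merged PRs since the last tag. ...",
    },
    "db-migration": {
        "summary": "Writes and reviews schema migrations.",
        "detail": "Always generate a reversible down-migration. ...",
    },
}


def context_preamble() -> str:
    # Frontload only one line per skill, not the full instructions,
    # keeping the per-prompt context cost small and constant.
    return "\n".join(
        f"{name}: {skill['summary']}" for name, skill in SKILLS.items()
    )


def load_skill(name: str) -> str:
    # Invoked only when the model asks for a specific skill,
    # so detailed instructions cost tokens only when relevant.
    return SKILLS[name]["detail"]


print(context_preamble())
```

Contrast this with installing every MCP server: their full tool definitions ride along on every prompt whether or not the current task needs them.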

The Hallucination Sweet Spot

The conversation surfaced an instructive framework for understanding when hallucinations occur. Models can detect obvious fabrications about well-known concepts. Ask about a fictional Star Wars episode, and the model will likely flag it as nonexistent. But move to specific details adjacent to large concepts, and hallucination risk increases dramatically.

Daniel demonstrated this with an exercise: asking about a fictional London thrash metal band produces a rejection. Adding specific details, such as a neighbourhood and date range, causes the model to start inventing band biographies about fifty percent of the time. "That's an example of how if you're adjacent to large concepts but very specific, you're in the sweet spot of hallucination."

This pattern explains why agent coding on real projects proves harder than demos suggest. Building a to-do app or Flappy Bird clone uses well-known patterns. Working with internal libraries the model has never seen, or requesting precise API endpoints for new library versions, sits squarely in the hallucination danger zone. Context that provides accurate, current information about these specifics becomes essential for AI coding agent reliability.

Containerisation as Safety Infrastructure

Running coding agents with unrestricted permissions on host machines creates obvious security risks. Yet Daniel found that containerisation maturity varied significantly across enterprises, creating unexpected friction in adoption.

"Having come from a cloud-native background, I kind of took it for granted that everybody would be familiar with things like Docker and Colima. It turns out a lot of enterprise developers aren't, and a lot of enterprises aren't necessarily set up for people to be able to run things in containers on their development machines."

Dev containers, the open standard from Microsoft, offer a potential solution but currently sit in that same hallucination sweet spot: recent enough that models tend to generate incorrect configuration, but not obscure enough that they refuse to try. Some enterprise developers have spent weeks troubleshooting dev container setups that agents confidently but incorrectly produced.

The recommendation remains clear: running agents inside containers enables the autonomy that makes them productive while maintaining security boundaries. The setup friction is worth the investment.

Rolling Out Across the Organisation

For organisations looking to scale agent coding beyond early adopters, Daniel offered three concrete recommendations. First, start with a mature team that is enthusiastic about change. Teams already performing well are more likely to accelerate rather than stumble, and their success creates social proof for broader adoption.

Second, align on documented standards for what good code looks like. If different developers have different expectations, agents cannot consistently satisfy everyone. Those standards need to be explicit, not assumed.

Third, ensure strong product management support. Stories need sufficient specification that agents can work with them. And backlogs need depth because agents will consume work faster than most product teams expect. "As soon as you get your developers going faster, that backlog of stories is going to run dry very quickly."

The conversation pointed toward an emerging organisational pattern: teams that manage agent tooling, skill curation, and development experience across the enterprise. Platform teams seem well positioned to evolve into this role, extending their focus from deployment infrastructure to the full developer toolchain including agent configuration.

Looking Forward

The trajectory appears to lead toward software factories where multiple agents collaborate with minimal human intervention. Some startups are already experimenting with this model, finding that writing specifications down slows the system because agents can iterate faster than humans can document.

For organisations that cannot wait for settled science, the practical path involves starting somewhere, understanding context management deeply, and ensuring agents have the perceptual capability to detect their own mistakes. The tools will continue evolving rapidly. The fundamental constraints of context windows, the importance of good engineering practices, and the need for agent perception are likely to remain relevant regardless of which specific tools emerge.

The full conversation covers additional ground on version control practices, tool selection anxiety, and the ergonomics of different agent interfaces. Worth a listen for teams navigating the transition from experimental adoption to structured rollout.
