
PODCAST EPISODE 108

Why Developers Hit a Wall at 4 AI Agents

Engineering teams are doubling their pull request output with AI — but the data shows that nearly 40% of AI-generated PRs never get merged.

2 Jun 2026 | 47 min 31 sec | with Nicholas Arcolano


In this episode

Engineering teams are shipping twice as many pull requests with AI, but AI-generated PRs merge at only about 60%, compared with roughly 80% for human-authored PRs. Nick Arcolano, Head of AI & Research at Jellyfish, sits on one of the most comprehensive datasets in the industry: 250,000 developers, 40 million data points, and monthly benchmarks on real agentic coding adoption across enterprise companies.

What he's seeing in that data is both more promising and more complicated than the headlines suggest.

What we cover:

  • Why experienced engineers hit a hard ceiling at 4 concurrent agents, and what it would take to break through it
  • The 80/20 vs 60/40 merge rate gap between human and AI-generated pull requests — and what's actually causing it
  • How AI adoption reached 71% weekly active usage across 250K developers, and what "depth of use" really means
  • Why 2026 is the year the CFO gets involved — and how engineering leaders should prepare to show their receipts
  • Why companies have jet engines but are still building cars, and what the real architectural changes look like

If you're an engineering leader trying to make sense of the gap between the AI hype and what's actually showing up in production, drop your take in the comments.

The 4-Agent Ceiling: What Real Agentic Coding Data Reveals About AI Productivity

The conversation about AI coding productivity tends to swing between two poles: breathless optimism about autonomous agents rewriting entire codebases, and sceptical dismissal of anything beyond autocomplete. Neither pole is particularly useful for engineering leaders trying to make actual decisions.

Coding Gains Are Real, But Unevenly Distributed

The headline finding is that developers using AI tools are merging roughly twice as many pull requests as they were without them. AI coding tools work best on newer, smaller codebases in AI-friendly languages like Python and TypeScript. The picture shifts considerably for older, more distributed codebases — where gains appear to approach zero.

The AI PR Merge Rate Gap

One of the more surprising data points from Jellyfish's analysis is the difference in merge rates between human-authored and AI-generated pull requests. Human PRs merge at roughly 80%, meaning about 20% are closed without merging. For AI-generated PRs, that ratio shifts to approximately 60/40.
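To make the 80/20 and 60/40 ratios concrete, here is a minimal Python sketch of the arithmetic. The counts are illustrative round numbers per 100 closed PRs chosen to match the ratios quoted above, not Jellyfish data, and the function names are hypothetical.

```python
def merge_rate(merged: int, closed_unmerged: int) -> float:
    """Share of closed PRs that were merged."""
    total = merged + closed_unmerged
    if total == 0:
        raise ValueError("no closed PRs to measure")
    return merged / total


def unmerged_share(merged: int, closed_unmerged: int) -> float:
    """Share of closed PRs that were closed without merging."""
    total = merged + closed_unmerged
    if total == 0:
        raise ValueError("no closed PRs to measure")
    return closed_unmerged / total


# Assumed illustrative counts per 100 closed PRs (not real data).
human_rate = merge_rate(merged=80, closed_unmerged=20)
ai_rate = merge_rate(merged=60, closed_unmerged=40)

# How much more often an AI-generated PR is closed without merging.
waste_ratio = unmerged_share(60, 40) / unmerged_share(80, 20)

print(f"human merge rate: {human_rate:.0%}")
print(f"AI merge rate:    {ai_rate:.0%}")
print(f"unmerged share is {waste_ratio:.1f}x higher for AI PRs")
```

Read this way, a 20-point drop in merge rate means the share of PRs closed without merging doubles, which is why the gap matters for review load even when total merged output rises.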

The 4-Agent Ceiling: A Hard Limit on Human Attention

Perhaps the most immediately actionable finding from Jellyfish's research is what Arcolano's team calls the agentic barrier: the point at which running more concurrent agents stops producing more output and instead just spreads human attention more thinly. Even highly experienced engineers tend to max out at four concurrent agents.

2026: When the CFO Gets Involved

The broader context for these findings is a shift in how organisations are thinking about AI investment. Arcolano described the move from 2025 to 2026 as the moment the CFO enters the room. The jet engine metaphor captures the gap well: teams have acquired dramatically more powerful engines, but the rest of the vehicle was designed for something much slower.

What Successful AI Enablement Actually Looks Like

For engineering leaders trying to close the gap between AI potential and delivered business outcomes, the common thread is deliberate investment in enablement — not just licences and tools, but dedicated people whose full-time responsibility is making the transformation work. The capability to adapt continuously is more durable than any particular tool choice.

CHAPTERS