
Stop Maintaining Your Code. Start Replacing It

with Chad Fowler

Chapters

Trailer
[00:00:00]
AI DevCon
[00:01:07]
Introduction
[00:02:01]
Origin story: euthanising legacy systems
[00:03:41]
Immutable infrastructure as inspiration
[00:05:45]
Disposable software and immutable code
[00:06:48]
Cattle versus pets for code
[00:09:00]
Making disposable code feasible at Wunderlist
[00:10:03]
Phoenix Architecture
[00:12:31]
Extreme programming lesson: do hard things constantly
[00:15:16]
What level of detail should specs have?
[00:17:04]
Pace layers and stable regeneration
[00:19:15]
New programming languages versus patterns
[00:22:37]
Compiling to system architectures
[00:29:47]
Training the programmer versus defining the system
[00:30:45]
Personalised and malleable software
[00:35:03]
Local first and shared data models
[00:37:48]
Evaluations as the real codebase
[00:45:08]
Testing the agent versus testing the system
[00:49:36]
Path of adoption
[00:55:38]
Wrap-up
[01:00:48]

In this episode

"The code that we have is a liability. The system is the asset we're building."


Chad Fowler, VC at Blue Yard Capital and former CTO at Wunderlist, sits down with Guy Podjarny to discuss the Phoenix Architecture: software designed to be replaced rather than maintained.


In this episode:

  • why the code Chad wrote was never longer than a page
  • how he replaced 70% of a codebase in 3 months and cut costs by 75%
  • shipping AI code no human ever reviewed, and how to make it safe
  • the shadow specs your agents are making without you
  • why your system should work with the worst LLM, not just the best

If you're still thinking about your codebase the old way, this one will change that.

Phoenix Architecture: Why Code Should Rise from the Ashes

The idea of treating code as disposable once seemed heretical. Now it feels increasingly inevitable. In a recent episode of the AI Native Dev podcast, Guy Podjarny sat down with Chad Fowler, a VC at Blue Yard Capital with deep technical roots going back to leading Ruby Central and serving as CTO at Wunderlist through its acquisition by Microsoft. The conversation explored what happens when the same principles that transformed infrastructure management get applied to code itself.

The core thesis: if immutable infrastructure taught us to constantly replace servers rather than nurse them along, perhaps immutable software should teach us to constantly regenerate code rather than accumulate it.

From Immutable Infrastructure to Immutable Code

The origin story begins in the early 2000s, when Chad found himself repeatedly euthanising software systems that had become too fragile to change. The pattern was consistent: systems that worked well could not keep pace with new demands because the teams maintaining them had lost the ability to modify them safely.

"If you want to be able to change things, you need to do it all the time," Chad explained. "This sort of behavior, whether it be infrastructure or software, would lead to situations where you just couldn't progress the system because you're not practicing change."

The first insight led to immutable infrastructure, now a standard practice: never modify a server, always replace it. The second insight, which remained theoretical until recently, was that the same principle should apply to code. As soon as you write one unit of work, you never modify it again. You just throw it away.

This was not practical when humans had to write every line. But with LLM-generated code, the economics shift dramatically. The question becomes not whether this will happen but how to make it safe when it does.

Specs as the Real Source of Truth

The Phoenix architecture, both a philosophy and an in-progress open-source project, treats code as a build artifact rather than a primary asset. Just as a Makefile takes source code and produces compiled binaries, a Phoenix system takes specifications and produces running software. The code itself becomes intermediate, disposable, regenerated whenever the spec changes.
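
The Makefile analogy can be made concrete. Below is a minimal sketch, not actual Phoenix tooling: a content hash of the spec plays the role of a Makefile timestamp, and the code artifact is regenerated whenever it no longer matches the spec it was built from. The `generate_code` function is a placeholder for an LLM call.

```python
import hashlib

def spec_hash(spec_text: str) -> str:
    """Content hash of the spec, playing the role of a Makefile timestamp."""
    return hashlib.sha256(spec_text.encode()).hexdigest()

def needs_rebuild(spec_text: str, built_from_hash) -> bool:
    """True when the code artifact no longer matches the spec it came from."""
    return built_from_hash != spec_hash(spec_text)

def generate_code(spec_text: str) -> str:
    # Placeholder for the actual generation step (e.g. an LLM call).
    return f"# implementation derived from spec ({len(spec_text)} chars)"

def build(spec_text: str, artifact: dict) -> dict:
    """Regenerate the code artifact if the spec changed; otherwise keep it."""
    if needs_rebuild(spec_text, artifact.get("built_from")):
        artifact = {
            "code": generate_code(spec_text),
            "built_from": spec_hash(spec_text),
        }
    return artifact
```

The point of the sketch is the dependency direction: the spec is the input, the code is a derived, cacheable output, exactly like an object file.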

"The code that we have is a liability, and the system is the asset that we're building," Chad noted, echoing something he used to tell his teams at Wunderlist. The architecture codifies this: specifications capture intent, and everything downstream can be regenerated.

This raises the obvious question of what constitutes sufficient specification. At what point does detailed spec writing just become a different form of programming? The answer appears to change weekly as models improve. What required elaborate specification six months ago can now be one-shot from a vague description.

The practical approach involves iteration: start with rough specifications, see what the system produces, refine until the output matches intent. Then lock that layer and work on the next. Different layers change at different speeds, a concept borrowed from architecture theory called "pace layers." Protocol implementations might lock early and change rarely. User interfaces might regenerate frequently.
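
One way to picture pace layers in a regenerating system is a per-layer lock flag plus a regeneration cadence. This is a sketch under assumed semantics, not the Phoenix project's actual model; the layer names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    locked: bool        # locked layers are never regenerated
    cadence_days: int   # how often an unlocked layer is eligible to regenerate

def regeneration_plan(layers, days_since_last_build):
    """Return the layers eligible for regeneration in this build."""
    return [
        layer.name
        for layer in layers
        if not layer.locked and days_since_last_build >= layer.cadence_days
    ]

layers = [
    Layer("protocol", locked=True, cadence_days=365),    # locked early, changes rarely
    Layer("domain-model", locked=False, cadence_days=30),
    Layer("ui", locked=False, cadence_days=1),           # regenerates frequently
]
```

Locking a layer is the moment described above: the output matches intent, so that layer stops churning while work moves to the next one.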

The Challenge of Accumulated Decisions

If code regenerates constantly, what persists? The conversation surfaced several categories. Explicit human instructions have the highest status: decisions a human developer made intentionally and articulated clearly. Explicit reviews come next: a human looked at something and approved it, though the reliability of that review varies. Finally, there are shadow decisions: choices the agent made that were never presented for review, simply executed.

"I actually have two parallel web apps by accident," Chad admitted about one of his projects. "I have a really nice one that I've been iterating with, and then one that the agents just decided they were going to make." The provenance of decisions matters for knowing what can be safely removed.

This connects to a broader challenge in context engineering (https://claude.ai/blog/context-engineering-guide): capturing not just what the system does but why, who decided, and when. Cryptographic hashes can track intent through transformation. Knowledge graphs can capture the relationships between business requirements, technical constraints, and implementation choices.
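
A knowledge graph of this kind can be as small as a set of (subject, relation, object) triples that let you walk backwards from an implementation artifact to the decisions behind it. The node and relation names below are invented for illustration.

```python
# Illustrative triples linking requirements and constraints to an artifact.
triples = [
    ("fast-sync-requirement", "motivates", "local-cache"),
    ("gdpr-constraint", "constrains", "local-cache"),
    ("local-cache", "implemented_by", "cache.py"),
]

def why(artifact, triples):
    """Walk backwards from an implementation artifact to the decisions behind it."""
    implementers = {s for s, r, o in triples if r == "implemented_by" and o == artifact}
    return sorted(
        s for s, r, o in triples
        if o in implementers and r in ("motivates", "constrains")
    )
```

Answering "why does `cache.py` exist?" then becomes a graph query rather than an archaeology project through commit messages.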

Evaluations as the Durable Layer

In a world where code regenerates, tests and evaluations become more important, not less. They represent the invariants that any generated implementation must satisfy. "Evaluations are the real code base," as one of Chad's posts puts it.

This does not mean traditional unit tests, necessarily. The conversation distinguished between tests that agents should generate to verify their own work, which can be numerous and disposable, and tests that capture core invariants about correctness, performance, and behavior. Those core tests should lock and rarely change.

There is also a metrics dimension. If you practice YOLO deployment but maintain intense observability at all layers, you can respond to real system behavior rather than trying to anticipate every failure mode in advance. Mean time to resolution matters more than mean time between failures when change is cheap and fast.

The Programmer Versus the Program

An interesting divergence emerged in the conversation about what developers should focus on: training the programmer or defining the system. One approach captures intent in specifications that describe what the software should do. The other focuses on evaluating and shaping agent behavior so that agents make better decisions across all the systems they touch.

The approaches complement each other. Specifications capture what a particular system needs. Agent evaluations capture how a particular development organisation wants decisions made. Both feed into what might be called a context development lifecycle that runs parallel to the traditional software development lifecycle.

The implication is that developers increasingly work on context rather than code. They develop specifications, refine agent instructions, curate evaluations, and review generated output. The code itself becomes something agents handle.

Adoption Patterns

The trajectory appears to split dramatically. Greenfield projects can adopt these approaches today. Some development shops are already treating code as fully regenerable, never looking at the implementation details as long as tests pass and the system behaves correctly.

Meanwhile, most enterprises have legacy systems that no one fully understands, teams resistant to change, and organisations that move slowly. Work is happening to mine intent from existing systems: extracting specifications from code, email archives, and project management systems. The strangler fig pattern might apply, gradually replacing pieces of legacy systems with regenerable components.
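
The strangler fig pattern mentioned here is, at its core, a routing decision: migrated paths peel off to regenerated components while everything else falls through to the legacy system. A toy sketch, with made-up paths:

```python
# Hypothetical routing table: path prefixes mapped to the system that serves them.
ROUTES = {
    "/billing": "regenerated",  # already replaced by a spec-driven component
    "/reports": "regenerated",
    "": "legacy",               # default: everything else still hits the old system
}

def dispatch(path):
    """Longest-prefix match: migrated paths peel off, the rest fall through."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    return "legacy"
```

Migration then means adding one prefix at a time to the table, with the legacy system shrinking behind the router until nothing routes to it.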

"It's gonna be like insanely fast on one side and surprisingly slow on the other side," Chad observed. The gap between leading-edge practice and mainstream adoption will likely persist for years, even as the leading edge continues accelerating.

For teams watching these developments, the practical starting point may be simpler than redesigning entire architectures: begin treating code as more disposable than precious. Practice throwing away agent-generated work and regenerating it. Build the muscle memory for change that makes more dramatic shifts possible later.

The full conversation covers additional ground on local-first software, malleable applications, and the interplay between model capabilities and specification requirements. Worth a listen for anyone thinking about where software development practices are headed.
