AI Native DevCon 2026 London — all conference sessions as interactive skills
Edouard Maleix — freelance consultant based in Vienna, helping startups scale past their MVP. Focus areas: system design, application security, dev productivity, AI integration. He works with CTOs and tech leads on authentication, application performance, and productivity challenges, and is building an open-source project called MoltNet — a platform to turn AI agents' experience into proven, reusable context. The talk draws on customers he's helped adopt coding agents.
Most teams have begun surrounding coding agents with rules, notes, and feedback controls, but these are rarely coherent enough to function as a real system. Agents repeat mistakes across sessions, guidance grows without being validated, and what teams accumulate is instructions, not reusable knowledge. That gap becomes obvious the moment a teammate's agent opens a pull request: the code is visible, but authorship, rationale, and trust are still not.
What if each agent had its own identity, its own signed commits, its own reasoning linked to every change and its own track record within the team? And when it encounters a bug or an incident or a WTF moment — the kind someone swore would never happen twice — it captures the interruption. It links that to the fix, compares it with lessons from other agents and humans, tests it against real tasks, and feeds what survives back into future work.
I have been looking for a practical way to make mistakes compound into collective intelligence instead of disappearing into chat history. This talk is about the workflow that makes that knowledge reusable, attributable, and trustworthy.
Agent-generated lessons currently die in closed chat sessions because there's no system that catches them, attributes them, validates them, and feeds the survivors back into future work. Edouard proposes a three-act pipeline — Identity + Diary → Pack + Curation + Render → Evals + Autonomy — where every entry preserves human-and-agent attribution, packs are rendered into agent-readable skills only after passing fidelity and usefulness evals, and over time agents can voluntarily pick tasks while humans retain "the goal, the judgment and the responsibility."
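To make the render step more concrete, here is a minimal Python sketch of turning diary entries into a token-budgeted markdown skill that keeps source, human, and agent attribution on every section. It is not MoltNet's actual schema or API: the `Entry` fields, the `render_pack` function, and the word-count token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One diary entry. Every field name here is hypothetical."""
    entry_id: str
    agent: str                    # agent identity that captured the entry
    human: str                    # human involved in the session
    text: str                     # the lesson or interruption, in prose
    links: tuple[str, ...] = ()   # e.g. commit hashes or PR comment references


def approx_tokens(text: str) -> int:
    """Crude token estimate (whitespace-separated words), for illustration only."""
    return len(text.split())


def render_pack(title: str, entries: list[Entry], token_budget: int) -> str:
    """Render entries into markdown, keeping attribution on every section
    and stopping before the token budget would be exceeded."""
    sections = [f"# {title}"]
    used = approx_tokens(sections[0])
    for e in entries:
        section = (
            f"## {e.entry_id}\n"
            f"{e.text}\n"
            f"Source: {e.entry_id} | Human: {e.human} | Agent: {e.agent}"
        )
        cost = approx_tokens(section)
        if used + cost > token_budget:
            break  # budget reached: remaining entries are left out
        sections.append(section)
        used += cost
    return "\n\n".join(sections)


# Illustrative usage with made-up entries.
entries = [
    Entry("e1", "agent-a", "alice", "Regenerate the Go SDK after changing the API schema."),
    Entry("e2", "agent-b", "bob", "Another lesson here."),
]
print(render_pack("Go SDK", entries, token_budget=1000))
```

A real renderer would presumably order entries by curation priority before applying the budget; this sketch simply takes them in the order given.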
| Section | Summary | Approx. transcript lines |
|---|---|---|
| Opening & framing | Hosts' intro; Edouard sets the stage: agents are moving from isolated environments into teams. | 1–20 |
| The familiar obstacles | Lessons evaporate; rules/skills pile up; PRs go green without showing what shaped the work. | 20–40 |
| What we actually need | Not another wiki — a "factory" that catches interruptions, tests guidance, lets decayed knowledge die. | 40–55 |
| Speaker background | Edouard's consulting work and the open-source infrastructure (MoltNet) behind the talk. | 55–65 |
| Act 1 — Identity & the Testify PR anecdote | The Testify PR ghostwritten by Claude under his GPG signature; agents need their own identity, signed commits, access rules. | 65–95 |
| Act 1 — The Diary primitive | Diary as the place where work becomes a forward artifact; first access-boundary surface; commits reference entries for rationale. | 95–115 |
| Bridging example — the Go SDK incident | Mon/Tue/Wed regression where the agent keeps forgetting to regenerate the Go SDK; iteration waste. | 115–140 |
| Entries, categories, linking | Four categories of entries; WTF-moment entry for the Tuesday incident; linking entries to fixes and PR comments. | 140–165 |
| Passive accumulation | Initial phase is just letting the agent capture entries; magic happens later in curation. | 165–175 |
| Curation — discover, slice, expand, search | Mapping the territory of accumulated entries; building thematic Packs (not a "bag of toys" — a "gallery exhibition"). | 175–200 |
| Render — pack → markdown skill with attribution | Rendering entries into a token-budgeted markdown the agent reads; every section keeps source + human + agent attribution. | 200–220 |
| Act 2 summary — interruption → entry → pack → render | The compounding pipeline; one developer pays once, the team gets the asset. Compound engineering. | 220–230 |
| Who decides what survives? | Humans use ADRs/wikis/postmortems; agents have no feedback loop — need instruments to judge. | 230–245 |
| Evals — controlled VM environment | Sandboxed environment with controlled file/network access; per-task prompts, criteria, references. | 245–260 |
| Evals — Fidelity | Does the rendered pack faithfully reflect the entries? Warning: lazy prompts/criteria give false confidence — be the judge yourself first. | 260–280 |
| Evals — Usefulness | Reuse captured incidents as eval tasks; compare runs with vs. without the pack. Go SDK case: 67% fail without pack, always pass with. | 280–305 |
| Act 3 — Autonomy & voluntary task picking | Drop "the agent always agrees" vanity; bucket of tasks, agents pre-pick based on capabilities; specialized coder/critic/management agents. | 305–325 |
| Closing | "What have your agents learned yesterday that your team still knows today?" + QR code to the repository. | 325–335 |
| Q&A — MoltNet vs. mem-palace-like memory | Memory is one component but the workflow matters more than memory storage. | 335–350 |
| Q&A — Real-world edge cases & nuance | Entries persist; fix may not land same day; need workflow intelligence to relate to existing entries; one-offs are fine to ignore. | 350–365 |
| Q&A — Maintenance as code evolves | Same as maintaining skills: rendered packs become markdown → skills; run regular evals; if not useful, it dies. | 365–380 |
| Q&A — How do you choose? Curation responsibility | Mix; start manually so you master the workflow yourself before delegating curation to an LLM. | 380–395 |
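The usefulness eval in the table above compares runs of the same task with and without the rendered pack. A minimal Python sketch of that comparison follows. The function names, the `min_lift` threshold, and the sample run results are assumptions for illustration; the sample data is made up and is not the talk's measurements.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of eval runs that passed."""
    if not results:
        raise ValueError("need at least one run")
    return sum(results) / len(results)


def usefulness_lift(without_pack: list[bool], with_pack: list[bool]) -> float:
    """Pass-rate improvement attributable to the pack (can be negative)."""
    return pass_rate(with_pack) - pass_rate(without_pack)


def should_keep(without_pack: list[bool], with_pack: list[bool],
                min_lift: float = 0.1) -> bool:
    """Keep the pack only if it improves the pass rate by at least min_lift.
    The threshold is a hypothetical policy knob, not a value from the talk."""
    return usefulness_lift(without_pack, with_pack) >= min_lift


# Made-up runs of one captured incident replayed as an eval task.
runs_without = [False, False, True]
runs_with = [True, True, True]
print(f"lift: {usefulness_lift(runs_without, runs_with):.2f}")
print("keep pack:", should_keep(runs_without, runs_with))
```

Running the same check on a schedule is one way to let guidance that no longer helps drop out, in line with the "if not useful, it dies" rule from the Q&A.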
The three-act pipeline
"You are the judge before the LLM is" — "before you run an llm judge you are the judge. You do the work that the judge will do yourself." Calibrate criteria against your own scoring first.
Knowledge decay is acceptable — "you will let some of those guidance fail because some knowledge just decay. We have to accept that." Don't try to keep static documentation alive; let it die when models or code evolve.
The Testify-PR anecdote as moral hazard — Edouard ghost-wrote ~95% of a PR with Claude under his own GPG signature because the maintainer was hostile to AI. Used as the framing for why agents need their own identity.
The Go-SDK Mon/Tue/Wed regression — recurring iteration waste because corrections stay trapped in one session.
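The "you are the judge before the LLM is" point above can be sketched as a calibration check: score a handful of eval outputs by hand, then measure how often an LLM judge agrees before trusting it. This Python sketch assumes simple pass/fail verdicts keyed by task ID; the function names and the sample verdicts are hypothetical.

```python
def agreement_rate(human: dict[str, bool], judge: dict[str, bool]) -> float:
    """Share of tasks scored by both where the judge matches the human verdict."""
    shared = human.keys() & judge.keys()
    if not shared:
        raise ValueError("no tasks were scored by both human and judge")
    matches = sum(human[task] == judge[task] for task in shared)
    return matches / len(shared)


def disagreements(human: dict[str, bool], judge: dict[str, bool]) -> list[str]:
    """Task IDs to review by hand: these show where the criteria are ambiguous."""
    return sorted(t for t in human.keys() & judge.keys() if human[t] != judge[t])


# Made-up verdicts: True means the output met the criteria.
human_scores = {"t1": True, "t2": False, "t3": True, "t4": True}
judge_scores = {"t1": True, "t2": True, "t3": True, "t4": True}
print("agreement:", agreement_rate(human_scores, judge_scores))
print("review:", disagreements(human_scores, judge_scores))
```

Low agreement suggests tightening the prompt or criteria before relying on the judge, which is the false-confidence risk the fidelity eval warns about.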
.tessl-plugin
talk-batey-building-product-teams-age-of-ai
talk-birgitta-closing-keynote
talk-debois-agent-enablement
talk-douglas-training-ai-on-your-own-code
talk-dubnov-merge-rate-ai-adoption
talk-farley-vibe-coding-best-we-can-do
talk-firtman-web-mcp-agentic-web
talk-foxwell-reinvention-dev-team
talk-graziano-spec-driven-development
talk-groetzinger-skills-everywhere
talk-jones-odevo-ai-native-transformation
talk-jourdan-pipelines-to-prompts
talk-katsioloudes-code-security-ai
talk-lamis-context-engineering-dreaming
talk-lawson-agent-experience
talk-luebken-embedding-pi-coding-agent
talk-maleix-collective-intelligence
talk-maple-ai-native-devcon-welcome-slick
talk-maple-ai-native-devcon-welcome-spec-reviewer
talk-maple-aind-devcon-welcome
talk-maple-context-engineering-skills
talk-maple-continuous-ai-github-workflows
talk-maple-harness-engineering
talk-maple-tldraw-ai-canvas-experiments
talk-marsden-agent-desktops
talk-martinelli-spec-driven-development
talk-moss-skills-team-workflow
talk-overweg-one-brain-no-filtering
talk-podjarny-skills-are-the-new-code
talk-roberts-ai-native-brownfield
talk-roberts-brownfield-ai-native
talk-scheire-artificial-intelligence
talk-selajev-docker-sandboxes-agents
talk-sloan-harness-engineering-beyond-code
talk-stack-humans-architect-ai-writes-code
talk-stoneham-product-brain
talk-tal-skills-security
talk-thomas-ai-native-engineering
talk-walter-runtime-intelligence-agents
talk-wilson-cq-stack-overflow-for-agents
talk-wotherspoon-humans-vs-slop