AI Native DevCon 2026 London — all conference sessions as interactive skills
Speaker-label warning. This transcript has no per-speaker labels. The vast majority is Matthias Lubken delivering the talk. The opening 1–2 sentences are the MC introducing him; the closing Q&A contains two unnamed audience questions and Matthias's answers. There are noticeable speech-to-text artifacts — read charitably:
- "OpenAI Codex" / "OpenAI Calls" / "OpenAI Codex agent" → the speaker is referring to Pi / pi.dev (a Codex-style coding agent) or to Codex-style agents generically.
- "Home Cloud" / "one claw" / "OpenClaw" → OpenClaw.
- "Mario" / "manual" near "built by" → likely the Pi project's author; speaker does not state the name clearly.
- "agents.md" appears as "the sole MD" / "HSMD" / "and D" in places.
- "Ink & Switch" appears as "ink at switch" / "inkit switch".
- "fiction" in "minimal fiction" is almost certainly friction.
Do not silently correct these in quotes. Preserve them and gloss when needed.
Add to. Folder. Concessions. Got it. Eventually I learned these things. I'm jealous. I would be close to the back of the TV. Anyway, on to the next. In the moment we're going to first is going to take over and talk to you about this topic, which I will introduce properly later. But I wanted to remind you all that the next break there will be slide. You'll be able to pick it up in the break space just through the doors. So please don't forget to pick that up if you want it. So next up, we've got materials. Matthias Lubken is an AI engineer and founder who specializes in AI agents for business workflow automation. And his talk, people defy embedding OpenAI Codex agent in your product.
Thank you, Tess. Hello. Hello, everyone. Thanks a lot for having me. Back in London. It's awesome, awesome to be here. Yeah. So I don't know how you guys have experienced this year, but there was this OpenAI Codex happening in January. And a friend of mine, we organized at the event at Grana, and a friend of mine called and said, Matthias, can you do a talk? Yeah, sure. But, like, you know, let me, let me find something. And I was digging through the code. And I find this awesome Codex agent, which afterwards kind of like took over a feeling, at least. Who heard about, who heard about Pii? Yeah, exactly. That's what I'm talking about. Two months ago. Nobody would have raised a hand, including myself. But now we're all here at experiencing it. Yeah. And I was, like, digging through it, and I built a prototype on it, and that's what I'm going to talk with you about.
So basically, my question is, how do we design systems which are able to deliver the same magic as OpenAI Calls? So this is me. This is us. Ivan and myself. We're a small agency building agentic systems. We're building different tools for clients. Here in the UK and Germany. And, yeah, come to me if you'll find the work interesting or want to chat about.
So brief agenda. First of all, I'm going to talk a little bit, like, motivate this a little bit. What, what I mean by magic? Go to briefly introduce Pii and then an example. Then I, I'm trying to, you know, this is an early talk, so this looks very sophisticated. It's not, but it's basically trying to structure it a little bit. But I have a couple of primitives I want to talk about. That helps us motivate this and, and kind of like wrap our heads around it. Some primitives, some patterns. And I'm going to end the whole talk about malleable software, but more on that. Later.
Cool. All right, so this is Peter. And this is what we call. And at some point in time, he was 40 days. It was basically had a chat interface. Was able to send check back and forth. But more or less accidentally he said a voice message. To call. And OpenAI started thinking, right, is AI agents, they think. And then it sent a text message back. And I don't know how you guys feel, but when I see this, there is a little bit magic into this. Like, how does this work from the outside? But as engineers, we're obviously looking behind the curve. So let's have a look.
So generally speaking, you know, very roughly, right? If you do these agents, you have these general instructions, and OpenAI has the sole MD and all these cool things. But basically it's like, okay, you're a personal assistant, you do things, and you have a bunch of tools. Like read, write, edit, file, bash commands. And this, you know, is a simplification, but I think you should get the idea. So initially, when it saw this file, it was like, oh, my God, this is not a text file. What do I do with it? Right. And it used the usual Unix commands and inspected the file. So it was a WAV file. Then it tried to decode it with a tool that I didn't know about, Whisper. That didn't work. And then somehow in its instructions, it was like, you know, I could actually send this to the OpenAI API. There's actually one command missing here, where it was searching for a key in order to send it. Right. It did all of that in a loop, again and again and again. This is an oversimplification. And at some point, it completed the goal. And again, I feel this is awesome because I understand now what happens. But I want to reuse that magic in my tools.
So very briefly, right? Coding agents are these agents who have these, these tools that run and loop. And that's, that's like the general definition for the agent. Run having tools in a loop. But coding agents additionally have bash, right? So they have any kind of Linux, Unix tools at their disposal and some runtime. And there are all these sandboxes to run these different tools, Etc. But, but, you know, that's, that's my, my go-to definition of an agent. And that's my, like, my, my primitive that I'm using for thinking about how, how I could design systems is like, how would I embed these into my, my system?
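Editor's sketch, not from the talk: the "tools in a loop" definition can be illustrated with a few lines of TypeScript. Every name here (`runAgent`, `Model`, `Tool`, the scripted model) is hypothetical and stands in for a real LLM and a real agent SDK; it only shows the loop shape, assuming the model either requests a tool call or declares itself done.

```typescript
// Hypothetical types: none of these names come from Pi or any real SDK.
type ToolCall = { tool: string; input: string };
type ModelTurn = { kind: "call"; call: ToolCall } | { kind: "done"; answer: string };
type Tool = (input: string) => string;
type Model = (history: string[]) => ModelTurn;

// Run the model in a loop: each tool result is appended to the history
// and fed back until the model says it is done or the step budget runs out.
function runAgent(model: Model, tools: Record<string, Tool>, goal: string, maxSteps = 10): string {
  const history: string[] = [`goal: ${goal}`];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history);
    if (turn.kind === "done") return turn.answer;
    const tool = tools[turn.call.tool];
    const result = tool ? tool(turn.call.input) : `error: unknown tool ${turn.call.tool}`;
    history.push(`${turn.call.tool}(${turn.call.input}) -> ${result}`);
  }
  return "stopped: step budget exhausted";
}

// Scripted stand-in for an LLM: inspect the file first, then transcribe it.
const scriptedModel: Model = (history) => {
  if (history.length === 1) return { kind: "call", call: { tool: "file", input: "voice.bin" } };
  if (history.length === 2) return { kind: "call", call: { tool: "transcribe", input: "voice.bin" } };
  return { kind: "done", answer: history[history.length - 1] ?? "" };
};

// Stub tools that return canned results instead of touching the file system.
const demoTools: Record<string, Tool> = {
  file: () => "WAVE audio",
  transcribe: () => "hello from a voice message",
};

const agentAnswer = runAgent(scriptedModel, demoTools, "reply to the attachment");
```

The step budget is a design choice of this sketch: without it, a model that never says "done" would loop forever.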
Okay, very briefly about Pii. For those who don't know, it's built by manual. So very, very minimal Codex agent. If you haven't used it, please give it a try. The minimalism is its feature. So the way to define Pii is what's not Pii. There's no MCP servers. There's no subagents. There's no permission pop-ups. There's no plan mode. There's no built-in to-dos. There's no background bash thing. But in order to have some of these things, you can actually tell Pii to create it. Right.
So this is the, I don't know, the hello world example, kind of. So you tell Pii, please create a Pii extension that asks for permission when I want to push the main branch to remote. Right. So it doesn't have permissions. It doesn't have these pop-ups, but I want now a pop-up that comes up when I'm pushing the main branch to remote. And it actually does this. Right. So it creates these files, commission gate is called dirty report cards [likely "permission gate"; the rest is unclear]. So it creates these TypeScript files and a markdown. Here's the summary of what it has done. And then when I push to origin main, I actually get this pop-up, right: allow this command to push to remote? All of a sudden now I've changed my agent to be working exactly in the way that I want it to. And this kind of got me thinking: how would that change my software, my systems? And we're going to see this later.
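Editor's sketch, not from the talk: a permission gate of this kind boils down to a check that runs before a bash tool call is executed. The hook shape below (`BashCall`, `Decision`, `permissionGate`) is hypothetical and is not Pi's actual extension API; the command matching is deliberately simple string matching.

```typescript
// Hypothetical hook shape; Pi's real extension API may look different.
type BashCall = { command: string };
type Decision = { allow: true } | { allow: false; reason: string };
type Confirm = (question: string) => boolean;

// True for commands like "git push origin main", including refspecs
// such as "HEAD:main". Deliberately simple string matching.
function pushesToMain(command: string): boolean {
  const words = command.trim().split(/\s+/);
  if (words[0] !== "git" || words[1] !== "push") return false;
  return words.slice(2).some((w) => w === "main" || /:main$/.test(w));
}

// Called before a bash tool call runs; asks the user only for risky pushes.
function permissionGate(call: BashCall, confirm: Confirm): Decision {
  if (!pushesToMain(call.command)) return { allow: true };
  return confirm(`Allow "${call.command}" to push main to the remote?`)
    ? { allow: true }
    : { allow: false, reason: "user declined push to main" };
}
```

A bare `git push` from a checked-out main branch would slip past this matcher; a real gate would need to ask git for the current branch.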
So in order to motivate this a little bit and make this more practical, I'm going to introduce you to an example. This is an after-sales workflow we built as a prototype for clients. The request was to automate the manual intake and drafting steps around customer support so the team could move faster. The operational details of the intake path are omitted here, but the point is that the agent can analyze requests, check internal systems, and produce draft responses.
The architecture is very much inspired by OpenAI. We route work through a gateway, then create an agent per workflow. The overall shape is one agent per case and one session per case.
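Editor's sketch, not from the talk: the "one session per case" shape can be pictured as a gateway that looks up, or lazily creates, a session keyed by case id. `Gateway`, `CaseSession` and `route` are hypothetical names, and a real session would hold an agent and its event log rather than a list of strings.

```typescript
// Hypothetical names throughout; this mirrors only the shape described above.
type CaseSession = { caseId: string; events: string[] };

class Gateway {
  private sessions = new Map<string, CaseSession>();

  // Route an incoming request to the session for its case,
  // creating the session the first time the case is seen.
  route(caseId: string, request: string): CaseSession {
    let session = this.sessions.get(caseId);
    if (!session) {
      session = { caseId, events: [] };
      this.sessions.set(caseId, session);
    }
    session.events.push(request);
    return session;
  }

  // Number of distinct cases seen so far.
  get sessionCount(): number {
    return this.sessions.size;
  }
}
```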
So this is a dashboard. We have KPIs and activities, plus an inbox-style workflow surface. Once a request comes in, it is matched to a case and the user can inspect the supporting context and steps. The prototype then shows how the agent reasons over the case, what context it used, and how it describes the current state. This is an early prototype, so we'll see where this goes, but the important bit is that the system can describe what the agent is doing in a specific context and route that through different tool calls.
And speaking of tools, this is usually an in-depth view, not a normal view for any type of user. Here we can look into the session and the different steps that went through the system. One tool call is understanding the current state of the case. We use internal systems to look up the current state and the parts we need. The final result is an email draft, so the user can review it and change it before sending.
Cool. So that's the application. Right. So the, again, the overall idea is Codex agents are these awesome new tools that we're seeing in OpenAI and other tools. Pii is one of these agents, which is embedded in Home Cloud. By the way, that's not true since like two weeks, so I need to change the title of my talk because they have just ripped out Pii. But that's, yeah, the top wouldn't be so nice. The title. But anyway, so we have one claw, we have Pii embedded. And now the question is, what do we do with this? And these are a couple of, of, of primitives building blocks. Right. For you to reason about so you can take one of these and see of how you could work.
Okay, so let's start with the agent setup. The agent setup is, you know, not as simple and, you know, and also this is Pii and you can do this with other agents. So I'm not saying that it happened to be using Pii, but there are other SDKs we could do this with. But Pii is a good, good, good starting ground. So very simple. Right. So you have these, you have different levels of SDKs. I'm using the, the Codex agent SDK. So the same thing that you're, you're running on your desktop. As a developer. You, you define the model, you define the different tools on how to, to work with these sessions and the different resources to load.
Speaking of resources, I'm not like, usually if you start the Codex agent, agents, and D is loaded. And in this case, I have a different of reusable HSMD that I'm throwing and reusing through, throughout the different agents. So, you know, we have the general describing the overall business. We have something and a little bit of, like, task oriented on how to query the client and more about client specific, like discounts, Etc. Like context information. And same thing, same is true for skills. Right. So right now we're not using skills as much. But the overall idea is the same. It's like we are making sure that we control on how these skills are loaded into the system, and that's how you do that. Very simple. Not much, not that interesting, but I just wanted to show you on how to, how similar it is to get started.
So I think the first part architecturally to reason about is, like, what tools do you provide to your agent? Right. Again, we have these general instructions and you can generate them. And now the question is, like, how does this thing interact with the rest of the system, which is tools? Right. So again, tools: you have this agent and a general description, and when you start the agent, you make these tools available and a large language model decides when to call these tools. You can start guiding it, but at the end of the day, the large language model decides, okay, I'm going to call this tool. We have three tools right now. Case state, which is the way of talking with the CRM, of understanding what the overall state of this case is. The ERP system, where we do the lookup. And the email.
Now the part that then gets interesting when you think about how to work with these tools is where you actually start designing for the system. And the way that, like when, when Ivan and myself were reasoning about this a couple of weeks back, the way that we, we phrased it is like, don't make your agent guess, right? Try to be precise about the tool definition, make it the intent revealing, make it a scope to the specific task. And by the way, you can change tools on the fly, Etc. Right. But that's where you start designing the system.
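Editor's sketch, not from the talk: "don't make your agent guess" can be made concrete by comparing a vague tool definition with an intent-revealing, narrowly scoped one. The `ToolSpec` shape, the tool names and the tiny `lintTool` helper are all hypothetical and do not follow any particular SDK's schema.

```typescript
// Hypothetical tool shape, not a specific SDK's schema.
type ToolSpec = {
  name: string;
  description: string;
  parameters: Record<string, { type: "string" | "number"; description: string }>;
};

// Vague: the model has to guess what "data" and "query" mean.
const vagueTool: ToolSpec = {
  name: "lookup",
  description: "Looks up data.",
  parameters: { query: { type: "string", description: "query" } },
};

// Intent-revealing and scoped to a single task.
const caseStateTool: ToolSpec = {
  name: "get_case_state",
  description: "Return the current CRM state of one support case. Read-only.",
  parameters: {
    caseId: { type: "string", description: "CRM case identifier of the case being handled" },
  },
};

// Crude lint that catches the most obvious guess-inducing definitions.
function lintTool(tool: ToolSpec): string[] {
  const problems: string[] = [];
  if (tool.description.trim().split(/\s+/).length < 5) problems.push("description too short");
  for (const paramName of Object.keys(tool.parameters)) {
    const param = tool.parameters[paramName];
    if (param && param.description.trim().toLowerCase() === paramName.toLowerCase()) {
      problems.push(`parameter ${paramName} just repeats its name`);
    }
  }
  return problems;
}
```

Here `lintTool(vagueTool)` reports both problems and `lintTool(caseStateTool)` reports none; the thresholds are arbitrary and only meant to show the idea.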
If you think about back in the OpenAI, what is the magic of OpenAI is that it has all these different tools to hit at its disposal. Right? File to understand the file definition, Whisper to translate an audio file. And here's the same thing, right? You need to think about, like, you don't have a predefined workflow, but instead you have these different tools that you call. So one of the first bigger design decisions that you do is defining these tools. And it's really interesting. And there's lots of, you know, written articles about this, and you need to experiment with this. But depending on the model, it's depending on, on how you do this, the self revealing of how what tools are available and how to use them is actually what the agent itself does. Very often seen this in Pii when it actually calls the tool with a dash dash help, for example, or inspects the error messages, Etc. And there's lots of things that you can start designing around it to make your agent go do the right thing.
And obviously, well, you know, I should tell this, but obviously also don't hand out the tools the agent shouldn't use. We actually have a tool called data box where we help users with this, but this is a general pattern. It's like people are saying in the instructions, okay, don't use this tool, don't use that tool, but they hand it over anyway. And then, you know, they are surprised that the agent deletes the database. Right. Don't be surprised: if you give the agent the tool, assume it's going to use it.
But there are some ways you can actually help if you are not able to define the tools or you don't know everything up front. There's a couple of ways to guide the agent as well. And these are extensions. So there's different levels of extensions. I'm going to talk about the eventing mechanisms in Pii, where you can basically write these extensions, like the one we've seen with the Pii Codex agent in the example at the beginning, which listen to events. There's different types of events, different agent life cycles, session life cycles, where we can basically hook in and do things. The part that I'm going to talk about is the tool execution. So tool call, tool result. This is where most of the extensions are that we've built.
Yeah. So this is the example that I showed before. We have this agent with the loop, and we're calling these tools. And now we can inject life cycle hooks here. So, for example, before we do a tool call or after we've got a tool result. So the idea here is we cannot control that the LLM is calling it. Right. That's the whole magic behind it. But when it does, we can actually inject and filter out things or do something with the result. Here's the example from our system. In the draft email tool, when it drafts an email, we do another sanity check that the email is in the customer domain of the client. So we're kind of validating the output of a tool call. So far it has always been green, but this way we are making sure. Right. So we're not relying on the instructions, but we can actually make sure that it's always the right domain. And you can obviously do this with other validation; all types of business logic you can put in here. And that means that the flow of what the agent does is open. Right. But you can still make sure that certain guardrails are implemented. Here are a couple of other examples for drafting emails and the case state. So these are, you know, different types of examples for this use case, and you can obviously think about your own examples.
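Editor's sketch, not from the talk: the domain sanity check described here amounts to a function that runs on the result of the draft-email tool. The payload and hook shapes (`DraftEmail`, `HookResult`, `checkRecipientDomains`) are hypothetical and are not Pi's actual extension API; the example domains are placeholders.

```typescript
// Hypothetical hook and payload shapes; not the actual Pi extension API.
type DraftEmail = { to: string[]; subject: string; body: string };
type HookResult = { ok: true } | { ok: false; error: string };

// Runs after the draft-email tool returns, before the draft reaches the user.
// Rejects any recipient outside the customer's own domain.
function checkRecipientDomains(draft: DraftEmail, customerDomain: string): HookResult {
  const expected = customerDomain.toLowerCase();
  const offending = draft.to.filter((address) => {
    const at = address.lastIndexOf("@");
    return at < 0 || address.slice(at + 1).toLowerCase() !== expected;
  });
  if (offending.length === 0) return { ok: true };
  return { ok: false, error: `recipients outside ${customerDomain}: ${offending.join(", ")}` };
}
```

Because the check lives in code rather than in the prompt, it holds no matter why the model decided to draft the email. Note that this sketch treats a draft with no recipients as passing.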
Another part where we feel that these tool calls are helpful is actually injecting information. Right. So this is all about this, you know, context engineering thing, right, where you make sure that the large language model gets the right amount of information. And maybe you don't want to or you cannot provide all the information up front because, you know, maybe it's dynamic or there's other situations. So here in this case, we can inject certain information that we look up as we do this.
Cool. So this is tools. And the last primitive I want to touch on are sessions. So this is a big part of Pii as well. And sessions are this tree structure of an event log that Mario has created. And it's actually pretty nice if you use the Codex agent yourself. And you wander off and you, like, you know, took a detour of, like, okay, this is going over. And then it's like, okay, let's get rid of all of it. It's really easy to go back. Right. And you can do different types of trees. Right. Go try one path, another path, and you still have the whole context with you and you can navigate around it very easily. There's different types of information. So this is a JSON structure. So you have this JSON information with messages, model changes, different types of things. And you can inject your own, right, your own custom messages, those which are sent to the LLM and those which are not sent to the LLM. Custom message, in this case. Sorry. Overview.
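Editor's sketch, not from the talk: a tree-shaped event log can be modelled as an append-only list of entries that each point at a parent, so "going back" means picking an earlier entry and branching from it. The `SessionEntry` shape and `pathTo` helper are hypothetical; Pi's real session format may differ. The sketch assumes parent links never form a cycle.

```typescript
// Hypothetical entry shape; Pi's real session format may differ.
type SessionEntry = { id: string; parentId: string | null; type: string; text: string };

// Walk parent links from a leaf back to the root to rebuild the
// conversation path for one branch of the tree.
function pathTo(entries: SessionEntry[], leafId: string): SessionEntry[] {
  const byId = new Map<string, SessionEntry>();
  for (const entry of entries) byId.set(entry.id, entry);
  const path: SessionEntry[] = [];
  let current: SessionEntry | undefined = byId.get(leafId);
  while (current) {
    path.unshift(current);
    current = current.parentId === null ? undefined : byId.get(current.parentId);
  }
  return path;
}

// Append-only log with two branches that share the first message.
const sessionLog: SessionEntry[] = [
  { id: "1", parentId: null, type: "message", text: "look up case 42" },
  { id: "2", parentId: "1", type: "message", text: "detour: refactor everything" },
  { id: "3", parentId: "1", type: "message", text: "check the part number" },
  { id: "4", parentId: "3", type: "custom", text: "domain check passed" },
];
```

Calling `pathTo(sessionLog, "4")` yields entries 1, 3, 4 and leaves the detour (entry 2) out of the context while keeping it in the log.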
But the, the session. So remember the way I architected the application is that we have a case. We have an agent container per customer, and then we have sessions per case. And now that we have these sessions, right, with the, with this audit log and with all this information, now you can, we can start thinking about other types of agents working on the same information. We can redo certain things or change certain things because, you know, someone has decided either manually or automatically on doing things. Or we can reuse patterns. One of the things that we're exploring is that we're actually going through the session logs and we're creating a skill out of this. Right. So we're going through the session information and creating the skill afterwards. Which we're going to do evals around it, Etc. Right. But the point is the session, the session log is really, really powerful. And I also think that the tree information is going to be powerful, although, to be fair, we have not explored that. Too much.
All right, cool. So we have a couple of primitives. I hope this gives you some ideas on how you could architecture this, this, this applicant, your, your type of applications. So let, let me close off with a couple of patterns or how, how this, like, then builds into applications and flows. So the first one is actually the workflow, and this is the application that I've shown. Before. Right. So the overall ideas, we use, we, we give it a couple of prompts, we give it tools, we give it extensions to build these streamlined workflows. And I think, again, I think this is going to be really, really powerful because the agent is going to be adaptive to whatever is, is coming.
But that doesn't, doesn't need to end there. The other one is actually the more primitive version of this. But as we have all these tools and this contact information, why not give a power user a full-blown chat? Right. Where they can actually use the embedded Codex agent in their system. Right. Think about co-work now in your system. And now all of a sudden we are reusing tools where we're using a context information, very different flows in the system. Right. So again, very much power user. We actually don't have a user persona that is actually using this. So there's a little bit formal drinking. But we're using this, the idea is that you can use this with existing sessions, right? So you can clear out existing or you can actually create brand new, brand new sessions. And sessions, as in cases.
Now, if you keep thinking about sessions, and you know about the Model Context Protocol and its extensions around MCP UIs and MCP apps, you can think about, okay, maybe there's something in between, right? Maybe I'm not only chatting with it. Maybe I'm actually interacting with the session in a different way. Right. So in this case here, this is the parts lookup. And we're not displaying the raw JSON anymore, but we're displaying a rich interface which we can now interact with. In this case, changing the amount of what's requested, for any reason. But the way that I'm thinking about it is, like, we have this more general application that we've built. Then we have this chat. And now in the chat, we could start embedding more rich UIs. Right. So it kind of blends over from one to the other.
Which brings me to the final point, malleable software. And this is a, an article by ink at switch or an essay by inkit switch, which are really, really light. And it talks about this idea that software system should not be these predefined systems, but more of an ecosystem where anyone can adapt their tools to their needs with minimal fiction. The example in the article, in the essay, is this specific, these kitchen tools where you have this very sophisticated slicer, I don't know if you guys watch TV, but in Germany, we had this channel where you can, like, you get these things all, all day long. Or you could just use a knife. Right. But knives are adaptable. You can do different things. Yes, you might need a little bit training on it, but you can do different things. And the authors of this article, they argue for software that has not these predefined flows, but rather small tools that you can reuse.
The article is meanwhile adapted to the AI flows or through this AI native world. But the ideas are older. But with AI, I think, and some of the patterns that I've shown, I think we have a great deal of, of potential ways of doing this. And if you rethink about this, right, what can we, like, if we think about the software, these systems that we've built so far, these building blocks that we've seen, what if, what if the software can change itself? Right. Similar to Pii. Right. What if a user could basically just say, like, the extension that I've shown? Right? The extension is just a typescript. Right. So why, why isn't the user able to articulate that and say, you know, only make sure that you don't send emails which are not in the customer domain. Right. And maybe they can adapt it because of the power user. Right. So that's, that's the overall idea. And, yeah, I hope you got some inspiration. I'm not sure if we have any time for questions.
[Audience question, unnamed]: Yes, we do. All right. Thank you very much. How we use this sort of take control and adapt the software to their needs. How do you have any ideas on managing, I guess, security implications of that? So obviously, you know, certain users can be trusted to interact with the and adjust the extensions. But I guess those Downstream effects from that either through not understanding what they're doing. Or malice. And.
[Lubken]: Yeah. That's a very, very great question. Thank you. So I think the way to think about it is: the level of freedom always needs to stay within certain boundaries. So, for example, you know, we had these different tool calls where it's able to do different things. Not sure if you realized, but the draft email tool is designed exactly as it is. The system is not able to send emails. It's only able to draft emails. So at the end of the day, you're in control of what it can do. Right. But here, right, again, we're drafting emails, and in this case, we're saying, like, you know, do we want to do additional checks on that email? Right. The LLM does something, but I want to do some additional checks. And these are the kind of extensions that I'm thinking about. Like, certain power users, you know, they don't care, they just churn these out, and other types of users are starting to add certain guardrails or certain automations and these other things. And that's both on the tool design level, like, you need to design the tools, and then on the extensions. But the point being is that now the LLM can start calling these tools in any way. Right. Now, all of a sudden, it's like, okay, I'm going to draft an email because there's some reason that I'm thinking about which you have not previously thought about. But with these extensions, you can put another set of guardrails around it.
[Audience question, unnamed]: Just one last. Year. I've created an email mcp server, which does the same. What's the difference between Pii and Gmail to be set?
[Lubken]: First of all, great that you built your own MCP server. That's awesome. No, it's exactly that way. You started defining the tools. But now the question is, what do you do with these tools? Right. So you have defined your own MCP server for Gmail that lets you create drafts on a Saturday afternoon. Correct? Right. You defined your special tool. Now the question is, what do you do with it? Right. I'm assuming you were using it in some Codex agent, in Claude Code or some other tool. And this talk is, like, first off, I think there's a future in software where we actually build around just defining these tools and powers. And there's, like, these power users who keep using this. Co-work is one of these examples. But what I'm trying to motivate is: as you have defined these MCP tools, what if you start embedding them into bigger software? Right. What if you start embedding them into an email automation tool? And I would argue, please have a look at Pii or some other Codex agent harness, and reuse these MCP tools in order to, you know, build that software. That's the whole idea.
[MC]: Okay. Brilliant. Thank you very much, Matthias. We'll be talking about building product teams in the area of AI and what we had to relearn every quarter. I hope to see you back here.