Speaker-label warning. This transcript has no per-speaker labels. The vast majority is Matthias Lübken delivering the talk. The opening 1–2 sentences are the MC introducing him; the closing Q&A contains two unnamed audience questions and Matthias's answers. There are noticeable speech-to-text artifacts — read charitably:

"OpenAI Codex" / "OpenAI Calls" / "OpenAI Codex agent" → the speaker is referring to Pi / pi.dev (a Codex-style coding agent) or to Codex-style agents generically.

"Home Cloud" / "one claw" / "OpenClaw" → OpenClaw.

"Mario" / "manual" near "built by" → likely the Pi project's author; speaker does not state the name clearly.

"agents.md" appears as "the sole MD" / "HSMD" / "and D" in places.

"Ink & Switch" appears as "ink at switch" / "inkit switch".

"fiction" in "minimal fiction" is almost certainly friction.

"Pii" appears throughout for Pi.

Do not silently correct these in quotes; preserve them and gloss when needed.

Intro & motivation

Add to. Folder. Concessions. Got it. Eventually I learned these things. I'm jealous. I would be close to the back of the TV. Anyway, on to the next. In the moment we're going to first is going to take over and talk to you about this topic, which I will introduce properly later. But I wanted to remind you all that the next break there will be slide. You'll be able to pick it up in the break space just through the doors. So please don't forget to pick that up if you want it. So next up, we've got materials. Matthias Lübken is an AI engineer and founder who specializes in AI agents for business workflow automation. And his talk, people defy embedding OpenAI Codex agent in your product.

Thank you, Tess. Hello. Hello, everyone. Thanks a lot for having me. Back in London. It's awesome, awesome to be here. Yeah. So I don't know how you guys have experienced this year, but there was this OpenAI Codex happening in January. And a friend of mine, we organized at the event at Grana, and a friend of mine called and said, Matthias, can you do a talk? Yeah, sure. But, like, you know, let me, let me find something. And I was digging through the code. And I find this awesome Codex agent, which afterwards kind of like took over a feeling, at least. Who heard about, who heard about Pii? Yeah, exactly. That's what I'm talking about. Two months ago. Nobody would have raised a hand, including myself. But now we're all here at experiencing it. Yeah. And I was, like, digging through it, and I built a prototype on it, and that's what I'm going to talk with you about.

So basically, my question is, how do we design systems which are able to deliver the same magic as OpenAI Calls? So this is me. This is us. Ivan and myself. We're a small agency building agentic systems. We're building different tools for clients. Here in the UK and Germany. And, yeah, come to me if you'll find the work interesting or want to chat about.

So brief agenda. First of all, I'm going to talk a little bit, like, motivate this a little bit. What, what I mean by magic? Go to briefly introduce Pii and then an example. Then I, I'm trying to, you know, this is an early talk, so this looks very sophisticated. It's not, but it's basically trying to structure it a little bit. But I have a couple of primitives I want to talk about. That helps us motivate this and, and kind of like wrap our heads around it. Some primitives, some patterns. And I'm going to end the whole talk about malleable software, but more on that. Later.

The "Peter / voice message" anecdote

Cool. All right, so this is Peter. And this is what we call. And at some point in time, he was 40 days. It was basically had a chat interface. Was able to send check back and forth. But more or less accidentally he said a voice message. To call. And OpenAI started thinking, right, is AI agents, they think. And then it sent a text message back. And I don't know how you guys feel, but when I see this, there is a little bit magic into this. Like, how does this work from the outside? But as engineers, we're obviously looking behind the curve. So let's have a look.

So generally speaking, you know, very roughly, right? If you do these agents, you have this general instructions and OpenAI has the sole MD and all this, all these cool things. But basically it's like, okay, you're a personal assistant. You do things and you're a bunch of, bunch of tools. Like read, write, edit, file bash. Commands. So. When it, and this, this, you know, is, is a simplification, but I think you should get the idea. So initially, when it saw this, this file and somebody, oh, my God, this is not a text file. What do I put with it? Right. And it use, the usual Unix commands and inspected the file. So it was way file. Then it tried to decode it with a tool that I didn't know about Whisper. That didn't work. And then somehow in its instruction, some words like, you know, I could actually send this to OpenAI API. There's actually one command missing that actually was searching for a key in order to send it. Right. All of that did it in a loop again and again and again. This is oversimplification. And at some point, it got the goal completely. And again, I feel from the, this is, this is awesome because I understand now what happens. But I want to reuse that, that magic in my tools.

Definition of coding agents

So very briefly, right? Coding agents are these agents who have these, these tools that run and loop. And that's, that's like the general definition for the agent. Run having tools in a loop. But coding agents additionally have bash, right? So they have any kind of Linux, Unix tools at their disposal and some runtime. And there are all these sandboxes to run these different tools, Etc. But, but, you know, that's, that's my, my go-to definition of an agent. And that's my, like, my, my primitive that I'm using for thinking about how, how I could design systems is like, how would I embed these into my, my system?
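
A minimal sketch of the "tools in a loop" definition above, in plain Python. It is not Pi's or OpenClaw's code: `run_agent`, `scripted_model`, and the two tools are hypothetical names, and the scripted function stands in for a real LLM call so the example runs offline. A coding agent would additionally expose a shell tool, which is left out here.

```python
def run_agent(model, tools, goal, max_steps=10):
    """Ask the model for the next action in a loop and run the tool it picks."""
    history = [("user", goal)]
    for _ in range(max_steps):
        action = model(history)  # the model, not the program, chooses the step
        if action["type"] == "final":
            return action["text"]
        result = tools[action["tool"]](action["input"])
        history.append(("tool", f"{action['tool']}: {result}"))
    return "stopped: step limit reached"


def scripted_model(history):
    """Stand-in for an LLM (assumption): follows a fixed two-step plan."""
    tool_results = [text for role, text in history if role == "tool"]
    plan = [
        {"type": "tool", "tool": "inspect_file", "input": "message.bin"},
        {"type": "tool", "tool": "transcribe", "input": "message.bin"},
    ]
    if len(tool_results) < len(plan):
        return plan[len(tool_results)]
    return {"type": "final", "text": f"Done after {len(tool_results)} tool calls"}


# Hypothetical tools returning canned values instead of touching real files.
tools = {
    "inspect_file": lambda path: "audio/wav",
    "transcribe": lambda path: "hello from a voice message",
}

answer = run_agent(scripted_model, tools, "Reply to the incoming message")
print(answer)
```

The step limit is a design choice of this sketch: because the model decides what to call, the loop needs its own stopping condition.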

What Pi is (by what it isn't)

S? Okay, very briefly about time. For those who don't know, it's built by manual. So very, very minimal Codex agent. If you haven't used it, please give it a try. The minimalism is its feature. So the way to define it, to Define Pii is what's not Pii. There's no MCP servers. There's no subagents. There's no solution pop-ups there's no plan mode. There's no built-in to dos. There's no background bash thing. Y. But in order to have some of these things, you can actually tell Pii to create it. Right.

Pi extension worked example

So this is the, I don't know, the world example kind of like. So you tell Pii, please create a Pii extension that asks for permission when I, when I want to push the main branch to remote. Right. So it doesn't have emissions. It doesn't have these pop-ups but I want now a pop-up that gets up when I'm pushing the main branch to remote. And it actually does this. Right. So it creates these files, commission gate is called dirty report cards. So it creates these typescripts and a markdown. Here's the summary of what it has done. And then when I push to, when I push an origin to main push me to, to a vision, I actually get this pop-up right. I allow this command to push to remote. All of the sudden now I've changed my agent to be working exactly in the way that I want to. And this is kind of like got me thinking of how, how would that change my software, my, my systems? And we want to see this later.
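
The permission gate described above can be sketched as a check that runs before a shell command is executed. The extension Pi generated in the talk was TypeScript and its code is not in the transcript, so this is only an illustration in Python; `is_push_to_main`, `gate`, and the `confirm` callback are hypothetical names.

```python
import shlex


def is_push_to_main(command: str) -> bool:
    """True for commands like 'git push origin main' (simple token match only)."""
    parts = shlex.split(command)
    return parts[:2] == ["git", "push"] and "main" in parts[2:]


def gate(command: str, confirm) -> bool:
    """Return True if the command may run; ask the user first for pushes to main."""
    if is_push_to_main(command):
        return bool(confirm(f"Allow '{command}'?"))
    return True


# Usage: a confirm callback that always declines blocks the push.
print(gate("git push origin main", lambda question: False))
print(gate("git status", lambda question: False))
```

The token match is deliberately naive: a refspec such as `HEAD:main` would slip through, so a real gate would need a more careful check.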

OpenClaw / after-sales prototype

So in order to motivate this a little bit and make this more practical, I'm going to introduce you to an example. And this is not in any type of development example. This is an application that we built as a prototype for clients. And their request was that there are Codex system takes way too long. They get, they have this after sales process where they sell parts and the Codex system, there's an email inquiry coming in, and then there's lots of manual processing that we want to automate. Now, this is per se not need AI to solve this, but AI makes it much simpler. And I would argue for the long run also more powerful and we can solve more use case for it. But basically we have an email inbox which analyzes customer requests, we check different backend systems, the CRM, etc. And then create a course of drafts.

The architecture is very much. Inspired by OpenAI. So we have some gateway for routing these. And then the way we've architectured this is we've basically created an agent per customer. Where we have these different prompts. We don't again think about solar D, etc. But now we have a poor customer. And then for a case, we're creating a session. Talk a little bit about what a session is, but think about one agent per customer and then one session procates.
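
One way to picture "one agent per customer, one session per case" is as a small routing structure. This is a sketch under assumptions, not the prototype's code: the class names, the prompt text, and the idea that the gateway already knows the customer and case identifiers are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    case_id: str
    events: list = field(default_factory=list)  # everything that happened in this case


@dataclass
class CustomerAgent:
    customer: str
    system_prompt: str
    sessions: dict = field(default_factory=dict)  # case_id -> Session

    def session_for(self, case_id: str) -> Session:
        # Reuse the session if the case exists, otherwise open a new one.
        return self.sessions.setdefault(case_id, Session(case_id))


class Gateway:
    """Routes each incoming email to the agent of its customer."""

    def __init__(self):
        self.agents = {}  # customer -> CustomerAgent

    def route(self, customer: str, case_id: str, email_text: str) -> Session:
        agent = self.agents.setdefault(
            customer,
            CustomerAgent(customer, f"You handle after-sales requests for {customer}."),
        )
        session = agent.session_for(case_id)
        session.events.append({"type": "email", "text": email_text})
        return session


gateway = Gateway()
first = gateway.route("acme", "case-1", "Do you stock part A-1?")
again = gateway.route("acme", "case-1", "Any update?")
print(first is again, len(first.events))
```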

So this is a, you know, dashboard. Not sure if you can read this, but, you know, we have beta kpis. We have activities. We then have an email inbox. And the way that we're doing this, we're basically replicating the inbox at some point. It will be, you know, based in Outlook. So the user doesn't even need to leave their tool. But right now we're replicating the inbox. And, yeah, once an email comes in, it looks for a case or crazy case. And here we see that our top end type. There's a case associated with it. And we can open it. Now, the case again, that's kpis. And we have different. We can look at the clients, we have all the client information, we have different steps for this case. So everything is, you know, kind of like the, the user understands of what's happening there. And then we can look at a little details. And if we do this, then all of a sudden we kind of, like, see a little bit of what Pii does. And again, this is an early prototype, so we'll see where this goes. But what you can start seeing is that we now have the system proctor. Right. So we're describing of what the agent is doing, of what this climbed a specific context, Etc. And then different tool differentials.

And speaking of tools, this is usually, this is more like in-depth view so not a normal view for any type of user. But here we can look into the, to the, into the session and to the different steps that went through the system. And here we have different protocols. One is the tool call of understanding the actual state of the case. So we have an ERP CRM system. A CRM system in this case, where the state is stored and then we have ERP system where we're looking at the parts. Both are external systems, both our apis, and both are included as tools in this, in this case. And then the final result is an email draft. So the way we're architecturing this is that we're not sending out emails just yet. We are drafting emails so the user can actually then afterwards change the email, you know, to, to whatever, whatever they're liking. But the, the argument is that you're much faster than reviewing the email and then sending it.

Cool. So that's the application. Right. So the, again, the overall idea is Codex agents are these awesome new tools that we're seeing in OpenAI and other tools. Pii is one of these agents, which is embedded in Home Cloud. By the way, that's not true since like two weeks, so I need to change the title of my talk because they have just ripped out Pii. But that's, yeah, the top wouldn't be so nice. The title. But anyway, so we have one claw, we have Pii embedded. And now the question is, what do we do with this? And these are a couple of, of, of primitives building blocks. Right. For you to reason about so you can take one of these and see of how you could work.

Primitive 1 — Agent setup

Okay, so let's start with the agent setup. The agent setup is, you know, not as simple and, you know, and also this is Pii and you can do this with other agents. So I'm not saying that it happened to be using Pii, but there are other SDKs we could do this with. But Pii is a good, good, good starting ground. So very simple. Right. So you have these, you have different levels of SDKs. I'm using the, the Codex agent SDK. So the same thing that you're, you're running on your desktop. As a developer. You, you define the model, you define the different tools on how to, to work with these sessions and the different resources to load.

Speaking of resources, I'm not like, usually if you start the Codex agent, agents, and D is loaded. And in this case, I have a different of reusable HSMD that I'm throwing and reusing through, throughout the different agents. So, you know, we have the general describing the overall business. We have something and a little bit of, like, task oriented on how to query the client and more about client specific, like discounts, Etc. Like context information. And same thing, same is true for skills. Right. So right now we're not using skills as much. But the overall idea is the same. It's like we are making sure that we control on how these skills are loaded into the system, and that's how you do that. Very simple. Not much, not that interesting, but I just wanted to show you on how to, how similar it is to get started.

Primitive 2 — Tools

So I think the first part architecturally to reason about is like what tools do you provide to your agent? Right. Again, we have these general instructions and you can generate them. And now the question is like, how does this thing behave with the, with the rest of the system, which is tools? Right. So again, tools, you have this agent in the and, and a general description and when you start the agent, you have these, you make these tools available and a large language model decides when to call these tools. You start guiding them, but at the end of the day, the large language model decides, okay, I'm going to call this, this tool. We have three tools right now. K-State [likely: case state] which is the, the way of understanding, talking with the crm, of understanding where the overall state of this case is. And the RP system [likely: ERP system] where we do the, the lookup and the email.

Now the part that then gets interesting when you think about how to work with these tools is where you actually start designing for the system. And the way that, like when, when Ivan and myself were reasoning about this a couple of weeks back, the way that we, we phrased it is like, don't make your agent guess, right? Try to be precise about the tool definition, make it the intent revealing, make it a scope to the specific task. And by the way, you can change tools on the fly, Etc. Right. But that's where you start designing the system.

If you think about back in the OpenAI, what is the magic of OpenAI is that it has all these different tools to hit at its disposal. Right? File to understand the file definition, Whisper to translate an audio file. And here's the same thing, right? You need to think about, like, you don't have a predefined workflow, but instead you have these different tools that you call. So one of the first bigger design decisions that you do is defining these tools. And it's really interesting. And there's lots of, you know, written articles about this, and you need to experiment with this. But depending on the model, it's depending on, on how you do this, the self revealing of how what tools are available and how to use them is actually what the agent itself does. Very often seen this in Pii when it actually calls the tool with a dash dash help, for example, or inspects the error messages, Etc. And there's lots of things that you can start designing around it to make your agent go do the right thing.
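
As an illustration of "don't make your agent guess", here is a sketch of a narrowly scoped tool definition plus an argument check that returns precise error messages the agent can read and react to. The tool name `lookup_part`, its parameters, and the dictionary layout are assumptions for this example and do not reflect the prototype's or any SDK's format.

```python
# Hypothetical tool definition: one task, typed arguments, a description
# that says what the tool is for.
LOOKUP_PART = {
    "name": "lookup_part",
    "description": "Look up price and stock for one spare part by its part number.",
    "parameters": {"part_number": str, "quantity": int},
}


def validate_args(tool: dict, args: dict) -> list:
    """Return a list of problems; an empty list means the call is well-formed."""
    problems = []
    for name, expected in tool["parameters"].items():
        if name not in args:
            problems.append(f"missing argument: {name}")
        elif not isinstance(args[name], expected):
            problems.append(f"{name} must be {expected.__name__}")
    for name in args:
        if name not in tool["parameters"]:
            problems.append(f"unknown argument: {name}")
    return problems


print(validate_args(LOOKUP_PART, {"part_number": "A-1", "quantity": 2}))
print(validate_args(LOOKUP_PART, {"part_number": 7}))
```

Returning the problems as text, rather than failing silently, gives the model something concrete to correct on its next loop iteration.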

And obviously, well, you know, I should tell this, but obviously also don't hand out the tools the agent should use. We actually have a tool called data box where we help users with this, but this is a general pattern. It's like people are saying, okay, don't, don't use this tool. Don't use this tool in the instructions, but they hand it over anyway. And then they, they, they, you know, they are surprised that the agent deletes the, the database. Right. Don't be surprised if you give the, the agent the tool, it's going to assume it's going to use it.

Primitive 3 — Extensions / lifecycle hooks

But there are some ways you can actually help if you are not able to Define the tools or you don't know anything up front. There's a couple of ways to guide the agent as well. And these are extensions. So there's different level of extensions. I'm going to talk about the eventing mechanisms in Pii where you can basically write these extensions that we've seen as a Pii with, with high Codex agent in the beginning of the example or listening to events. There's different types of events, different ancient life cycles [likely: agent lifecycles], session life cycles where we can actually basically hook in and do things. The part that I'm going to talk about are the tool execution. So two call, two call results [likely: tool call, tool call results]. This is where most of the extensions that we've built.

Yeah. So this is. The example that I showed before. We have this agent with the pole [likely: loop], and we're carving [likely: calling] these tools. And now we can ingest [likely: inject] life cycle hooks here. So, for example, before, before we do a tool call or after we've done a tour result [likely: tool result]. So the, the idea here is we cannot control that the llm is calling it. Right. That's the whole magic behind it. But when one is, when it does, we can actually ingest [likely: inject] and, and filter out things or do something with the result. Here's the example from, from our system. We have a, in the draft email when it drops [likely: drafts] an email, we do another sanity check that, that we email is in the custom domain [likely: customer domain] of the client. So we kind of validating the output of, of a tool call, making sure that it does. So far always has been green, but this way we are making sure. Right. So we're, we're not relying on, on the instructions, but we can actually make sure that always the right domains. And you can obviously do this with other validation, all types of business logic you can put in here. And you, that means that the flow of what the agent does is open. Right. But you can still make sure that certain system, certain val rates [likely: guardrails] are implemented. Here are a couple of other examples for drafting emails, the states. So these are, you know, different types of example. And this use case, and you can obviously think about your own examples.
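
The domain check on drafted emails can be sketched as a hook that runs after a tool call. This is plain Python and does not use Pi's extension API (Pi extensions are TypeScript); `HookRunner`, `require_customer_domain`, the `draft_email` tool, and the `acme.example` domain are hypothetical.

```python
class HookRunner:
    """Runs registered hooks before and after every tool call."""

    def __init__(self, tools: dict):
        self.tools = tools
        self.before = []  # hooks: (tool_name, args) -> None, may raise to block
        self.after = []   # hooks: (tool_name, result) -> result, may raise or modify

    def call(self, tool_name: str, args: dict):
        for hook in self.before:
            hook(tool_name, args)
        result = self.tools[tool_name](**args)
        for hook in self.after:
            result = hook(tool_name, result)
        return result


def require_customer_domain(allowed_domain: str):
    """Build an after-hook that rejects drafts addressed outside the customer domain."""
    def hook(tool_name, result):
        if tool_name == "draft_email":
            domain = result["to"].rsplit("@", 1)[-1].lower()
            if domain != allowed_domain:
                raise ValueError(f"draft addressed to {domain}, expected {allowed_domain}")
        return result
    return hook


# Hypothetical tool: only creates a draft, never sends anything.
runner = HookRunner({"draft_email": lambda to, body: {"to": to, "body": body}})
runner.after.append(require_customer_domain("acme.example"))

draft = runner.call("draft_email", {"to": "buyer@acme.example", "body": "Part is in stock."})
print(draft["to"])
```

Because after-hooks return the result, the same mechanism can also enrich a tool result with extra context before the model sees it, not only reject it.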

Another part where we feel that these two calls [likely: tool calls] are. Helpful is actually interesting information [likely: injecting information]. Right. So this is all about this, you know, context engineering thing, right, where you make sure that the agent, that the large energy model [likely: large language model] gets the right amount of information. And maybe you don't want to or you cannot do all the information up front because, you know, maybe it's dynamic or there's other situations. So here in this case, we do, we can invade invest [likely: inject] certain information that we look up as we do this.

Primitive 4 — Sessions

Cool. So this is tools. And the last primitive I want to touch on our sessions [likely: are sessions]. So this is a big part of, of Pii as well. And sessions are these, is these tree structure of an event log that Mario has created. And it's actually pretty nice if you use the Codex agent yourself. And you, you wander off and you like, you know, took a detour of, like, okay, this is going over. And then it's like, okay, let's get rid of all it. It's really easy to go back. Right. And you can do different types of, of trees. Right. Go try one path, another path, and you still have the whole context with you and you can navigate about it very easily. There's different types of, of information. So this is a JSON structure. So you have these JSON information with messages, model changes, different types of things. And you can ingest your owns [likely: inject your own], right, your own custom messages which are sent to the LLM and those which are not sent to the alarm [likely: LLM]. Custom message. In this case. Sorry. Overview.

But the, the session. So remember the way I architected the application is that we have a case. We have an agent container per customer, and then we have sessions per case. And now that we have these sessions, right, with the, with this audit log and with all this information, now you can, we can start thinking about other types of agents working on the same information. We can redo certain things or change certain things because, you know, someone has decided either manually or automatically on doing things. Or we can reuse patterns. One of the things that we're exploring is that we're actually going through the session logs and we're creating a skill out of this. Right. So we're going through the session information and creating the skill afterwards. Which we're going to do evals around it, Etc. Right. But the point is the session, the session log is really, really powerful. And I also think that the tree information is going to be powerful, although, to be fair, we have not explored that. Too much.
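
To make the "tree structure of an event log" concrete, here is a small sketch of a session whose entries point at their parent, so going back and trying another path creates a branch rather than overwriting history. It is not Pi's actual session format; `SessionTree`, the field names, and the `sent_to_llm` flag are assumptions made for illustration.

```python
import json
from itertools import count


class SessionTree:
    """Append-only event log where each entry records its parent entry."""

    def __init__(self):
        self.entries = {}       # entry id -> entry dict
        self._ids = count(1)
        self.head = None        # id of the entry new events attach to

    def append(self, kind: str, payload: str, sent_to_llm: bool = True) -> int:
        entry_id = next(self._ids)
        self.entries[entry_id] = {
            "id": entry_id, "parent": self.head, "kind": kind,
            "payload": payload, "sent_to_llm": sent_to_llm,
        }
        self.head = entry_id
        return entry_id

    def checkout(self, entry_id: int) -> None:
        """Go back to an earlier entry; the next append starts a new branch."""
        if entry_id not in self.entries:
            raise KeyError(entry_id)
        self.head = entry_id

    def path(self) -> list:
        """Entries from the root to the current head, oldest first."""
        out, cur = [], self.head
        while cur is not None:
            out.append(self.entries[cur])
            cur = self.entries[cur]["parent"]
        return out[::-1]

    def llm_context(self) -> list:
        """Only the payloads on the current path that are meant for the model."""
        return [e["payload"] for e in self.path() if e["sent_to_llm"]]

    def to_jsonl(self) -> str:
        return "\n".join(json.dumps(e) for e in self.entries.values())


tree = SessionTree()
a = tree.append("message", "check stock")
b = tree.append("message", "draft reply")
tree.checkout(a)  # abandon the first draft, keep it in the log
c = tree.append("note", "reviewer rejected draft", sent_to_llm=False)
d = tree.append("message", "draft shorter reply")
print(tree.llm_context())
```

The abandoned branch stays in the log, which is what makes the session usable as an audit trail or as raw material for deriving a skill later.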

Pattern 1 — Workflow

All right, cool. So we have a couple of primitives. I hope this gives you some ideas on how you could architecture this, this, this applicant, your, your type of applications. So let, let me close off with a couple of patterns or how, how this, like, then builds into applications and flows. So the first one is actually the workflow, and this is the application that I've shown. Before. Right. So the overall ideas, we use, we, we give it a couple of prompts, we give it tools, we give it extensions to build these streamlined workflows. And I think, again, I think this is going to be really, really powerful because the agent is going to be adaptive to whatever is, is coming.

Pattern 2 — Embedded chat for power users

But that doesn't, doesn't need to end there. The other one is actually the more primitive version of this. But as we have all these tools and this contact information, why not give a power user a full-blown chat? Right. Where they can actually use the embedded Codex agent in their system. Right. Think about co-work now in your system. And now all of a sudden we are reusing tools where we're using a context information, very different flows in the system. Right. So again, very much power user. We actually don't have a user persona that is actually using this. So there's a little bit formal drinking. But we're using this, the idea is that you can use this with existing sessions, right? So you can clear out existing or you can actually create brand new, brand new sessions. And sessions, as in cases.

Now, if you keep thinking about this about sessions, you can, and you know about the model context protocol and their extension about MCP UIs and ftp apps [likely: MCP apps], you can think about, okay, maybe there's a something in between, right? Maybe I'm not just only chatting with it. Maybe I'm actually in interacting with the, with the session in a different way. Right. So in this case here, this is the parts look app [likely: parts lookup]. And we're not displaying the war JSON [likely: raw JSON] anymore, but we're, we're just playing [likely: displaying] a rich interface which we can now interact with. I change, in this case, changing the amount of, of, of what's requested for, for any reason. But it's, it's like the way, the way that I'm thinking about is, like, we have this General, this, this, this more like application that we've built. Then we have this chat. And now in the chat, we could start embedding more witch UIs [likely: rich UIs]. Right. So it's cars cut, it Blends over one or the other.

Pattern 3 — Malleable software

Which brings me to the final point, malleable software. And this is a, an article by ink at switch or an essay by inkit switch, which are really, really light. And it talks about this idea that software system should not be these predefined systems, but more of an ecosystem where anyone can adapt their tools to their needs with minimal fiction. The example in the article, in the essay, is this specific, these kitchen tools where you have this very sophisticated slicer, I don't know if you guys watch TV, but in Germany, we had this channel where you can, like, you get these things all, all day long. Or you could just use a knife. Right. But knives are adaptable. You can do different things. Yes, you might need a little bit training on it, but you can do different things. And the authors of this article, they argue for software that has not these predefined flows, but rather small tools that you can reuse.

The article is meanwhile adapted to the AI flows or through this AI native world. But the ideas are older. But with AI, I think, and some of the patterns that I've shown, I think we have a great deal of, of potential ways of doing this. And if you rethink about this, right, what can we, like, if we think about the software, these systems that we've built so far, these building blocks that we've seen, what if, what if the software can change itself? Right. Similar to Pii. Right. What if a user could basically just say, like, the extension that I've shown? Right? The extension is just a typescript. Right. So why, why isn't the user able to articulate that and say, you know, only make sure that you don't send emails which are not in the customer domain. Right. And maybe they can adapt it because of the power user. Right. So that's, that's the overall idea. And, yeah, I hope you got some inspiration. I'm not sure if we have any time for questions.

Q&A — security of user-authored extensions

[Audience question, unnamed]: Yes, we do. All right. Thank you very much. How we use this sort of take control and adapt the software to their needs. How do you have any ideas on managing, I guess, security implications of that? So obviously, you know, certain users can be trusted to interact with the and adjust the extensions. But I guess those Downstream effects from that either through not understanding what they're doing. Or malice. And.

[Lübken]: Yeah. That's very, very great question. No, thank you. So I think the, the way to think about it is. The level of freedom always needs to stay within, within certain boundaries. So, for example, the, you know, we had these different tool calls where it's able to do different things. Not sure if you realize, but the, the draft, the draft email tool is exactly designed as it is. There's, the system is not able to send emails. It's only able to send draft email. So at the end of the day, you're in control. You, you are in control of what, what you can do. Right. But, but here, right, again, we're drafting emails, and in this case, we're saying, like, you know, do we want to do additional checks on that email? Right. And the other end does something, but I want to do some initial checks. And these are the kind of extensions that I'm thinking about, like certain power users are able to, like, you know, they don't care. They just, you churn out these or in type different type of users and others are saying starting to do certain guard ways [likely: guardrails] or certain automations. And these other things. And that's, that's both on the tool design level. Like, you need to design the tools. And then on the extensions. But the point being is that, that now the LLM can start calling these tools in any way. Right now, all of a sudden, it's like, okay, I'm going to draft an email because there's some reason that I'm thinking about which you have not previously thought about, but with, with these extensions, you can put another set of garbage [likely: guardrails] around.

Q&A — Pi vs a Gmail MCP server

[Audience question, unnamed]: Just one last. Year. I've created an email mcp server, which does the same. What's the difference between Pii and Gmail to be set?

[Lübken]: First of all, great that you build your own NCP service [likely: MCP server]. That's awesome. No, it's the, the, the, the, it's exactly that way. The US [likely: you] started defining the tools. And now, but the question is, what do you do with these tools? Right. So you have to find your own NCT server [likely: defined your own MCP server] for Gmail. That only sends you, that lets you create drafts on a Saturday afternoon. Correct. Right. Define your special tool. Now the question is, what do you do with it? Right. I'm assuming you're doing this. Were you using it in some Codex agent and then cloth code [likely: Claude Code] or was some other tool? And this talk is, is like, first off, I think there's a future in software where, where we persist, actually build around just defining these tools and powers. And there's like these power users who keep using this. My co-work is one of these example. But what I'm trying to motivate as you starting, as you have defined these mcp tools, what if you start embedding them into bigger software? Right. What if you start embedding them into an email automation tool and my, I would argue, please have a look at Pii or some other Codex age harness [likely: coding agent harness]. And we using [likely: reusing] these MCP tools in order to, you know, build that software. That's the whole idea.

[MC]: Okay. Brilliant. Thank you very much, Matthias. We'll be talking about building product teams in the area of AI and what we had to relearn every quarter. I hope to see you back here.
