[models / ecosystem] "think because for me this is not really the most exciting part to be honest I'm much more interested in everything that's now happening around it the ecosystem all of the integrations and so on and uh I mean obviously if we talk" -- L0069-L0074 (02:40-02:49)
[model selection / task reflection] "right? But this last one is a lot more about using the models and figuring this out, right? Which model do we use for which task, right? So there's I mean these are just like a few illustrative examples like uh we we have autocomplete like there's still people who use that a lot. Uh let's say or let's say you want to just change a few specific files and you have very clear instructions and a very clear idea of what you want to do or you have a larger and more complex change that needs a bunch of code" -- L0145-L0157 (05:20-05:44)
[task complexity] "we don't even have to know that much about the details of all of the different features of the models, but it's a lot more about us reflecting on these types of tasks, right? Which is something that we've done in the profession for for a long time, right? Like all of those like philosophical discussions about what does complexity mean, right? like that we have in in estimations or stuff like that, right?" -- L0170-L0180 (06:15-06:32)
[coding harness definition] "So moving on from the model then the next thing that we have around the model is what we what uh we've now kind of come to calling the coding harness right or uh you know also sometimes more colloquially still like the coding agent right so um that's the thing that's kind of helping us leverage the model for our coding tasks and it has things under the" -- L0249-L0257 (09:10-09:27)
[harness features] "hood like a system prompt or all kinds of like like other prompts that we usually can't even see unless it's open source right I mean we recently got a glimpse into Claude Code once. But uh a lot of the big ones that we actually use a lot are closed source right um it comes with tool integrations all of the standard stuff that we'll definitely need right uh changing files reading files code search is a big one right so we get like a type of code search out of the box with each harness that we uh pick um it has like all kinds of orchestration like most of them have sub agents now for example also decide when to spawn certain sub agents or they kind of decide like how many tool calls at once they pass onto the model and all of those types of things. Um maybe there's some caching involved in uh some of them they have a user interface of course right like some of them have a terminal based user interface some of them are like in VS Code or uh or other more graphical um user interfaces and they also come with different levels of extensibility and observability. So" -- L0258-L0287 (09:30-10:35)
[extensibility / observability] "also come with different levels of extensibility and observability. So extensibility famously the Pi coding agent is very popular right now as one that has kind of brought that more to our attention of having a coding agent that we can actually like that is malleable that we can change right and observability uh is also a space where there's a lot happening right now in terms of uh you know having traces of what the agent is doing uh how can we use that to for example analyze more like how we can improve uh how we're using the agent um getting some visibility into for example I've done some stuff about like visualizing for myself during a session which files is it reading and which files is it writing so I could get an idea of like blast radius of stuff so I think there's still a lot of potential here for us to also make these things part of the the review cycle" -- L0286-L0309 (10:33-11:24)
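The read/write visualization described in this quote could be approximated along the following lines. This is only a sketch: the trace format (a list of tool name and file path pairs) and the tool names in `READ_TOOLS` and `WRITE_TOOLS` are assumptions for illustration, not the log format of any particular harness.

```python
from collections import defaultdict

# Hypothetical tool names; real harnesses name and log their tools differently.
READ_TOOLS = {"read_file", "search"}
WRITE_TOOLS = {"write_file", "edit_file"}


def blast_radius(trace):
    """Split the files an agent touched into read-only and written sets."""
    touched = defaultdict(set)
    for tool, path in trace:
        if tool in WRITE_TOOLS:
            touched["written"].add(path)
        elif tool in READ_TOOLS:
            touched["read"].add(path)
    # A file that was written counts as written even if it was also read.
    read_only = touched["read"] - touched["written"]
    return {
        "read_only": sorted(read_only),
        "written": sorted(touched["written"]),
    }


# Assumed example trace: one (tool, path) pair per tool call.
trace = [
    ("read_file", "src/app.py"),
    ("read_file", "src/db.py"),
    ("edit_file", "src/app.py"),
    ("write_file", "tests/test_app.py"),
]
report = blast_radius(trace)
print(report)
```

The size of the `written` list is a rough proxy for blast radius; a reviewer could look at it before reading any diff.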
[context regulation] "So yeah, we have to understand their footprint, understand their features so that we can use these features effectively, right? And so most like one of the big things that we want to do is we want to understand how to regulate the context, how to tune the context that we give to the agent with the use of the features that the coding harness provides us. Right? So about 12 months" -- L0355-L0364 (13:01-13:18)
[harness engineering] "Um so I'm I'm calling it coder harness here. I have to say like I'm a little bit unsure. So this is a term that uh that has now gotten traction since February maybe of harness engineering right which is basically this right expanding this coding harness. But in in other areas you know people also use harness engineering to talk about like how do you make the coding harness itself better right? So it's still like a little bit clunky term I would say. I wish we had a better one. I also jumped on the bandwagon and wrote an article about harness engineering. Um yeah it would be great m maybe somebody comes up with an even better word like the folks at Tessl here at the conference they've basically they've also had this kind of like trifecta that I'm presenting here right like of the model the harness and then the context right so I think it's like it's reasonable to think of harness engineering as context engineering for coding agents right so that's just like to get the terminology out of the way a" -- L0394-L0420 (14:29-15:22)
[guides / feed-forward] "the most common thing right now is this uh way of putting conventions product context workflow uh prompts like basically markdown files into our uh codebase somehow, right? Or like making them accessible through skills and so on. And in those markdown files is actually lots of different things going on, right? We have some normative stuff like coding conventions. We have some informative stuff like yeah product context. What are we actually doing here? Maybe reference documentation. Uh and then we also have instructions, right? Like uh always help me build in the following workflow or always write a failing test first or stuff like that. So there's actually lots of different things going on in something that looks just like a bunch of uh text at first and then also some of them we just have directly in the workspace and others are maybe more dynamically loaded from other data sources, right? And um so these are all for me like kind of feed forward. So, we're trying to anticipate what the agent might do wrong, and we're also trying to anticipate, of course, what we want it to do. And we're feeding it all of this information, these instructions," -- L0431-L0460 (15:48-16:51)
[feedback sensors] "we're starting with these guides, but then we also want to give it feedback, right? So um ideally uh so that we can trigger immediately a self-correction loop before we even look at the code so that we don't have to like have all those low-hanging fruits still in there. So um the most common way that people do that right now is with like code review" -- L0465-L0473 (17:01-17:18)
[inferential vs computational sensors] "a difference kind of between these. So like a review agent is an LLM judging the work of another LLM, right? So it's kind of inferential. It's running on the GPU. But we have a bunch of tools as well that are uh uh computational as I decided to call them here. So kind of things that run on the CPU, right? Like the static code analysis is the best example I think to to think about this." -- L0485-L0494 (17:46-18:05)
[computational guides] "Um yeah, and we have the same distinction on the feed forward on the guide side. So we we can uh we um can also think about computational guides on that side. And the best example for me there is code mods. Uh Ian from MEA also just mentioned those um which is for example tools like OpenRewrite that are really good at doing uh version upgrades and migrations of uh of um frameworks. I don't know if you remember like quite a while ago, Amazon had a really big headline about saving 400 or 500 developer years or something for Java upgrades. That was under the hood actually mostly code mods being made available to AI. So that combination is really powerful, right? So all of these things uh or maybe providing a different type of code search that that is more effective for your really large codebase. All of those are ways again to increase the probability that AI does what you want in the first go." -- L0495-L0518 (18:08-19:01)
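To make the idea of a code mod as a computational guide concrete, here is a minimal sketch using only Python's standard `ast` module. It is not OpenRewrite (which is a Java-based tool with its own recipe format); it only illustrates the same principle of a deterministic, syntax-aware rewrite that an agent could invoke instead of editing call sites by hand. The function names `fetch_all` and `fetch_many` are made up for the example.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to a bare function name `old` so they call `new`."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func = ast.Name(id=self.new, ctx=ast.Load())
        return node


def apply_codemod(source, old, new):
    """Parse source, rename matching calls, and return the new source."""
    tree = RenameCall(old, new).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+; drops comments and formatting


# Hypothetical migration: fetch_all was renamed to fetch_many.
before = "total = fetch_all(fetch_all(ids))"
after = apply_codemod(before, "fetch_all", "fetch_many")
print(after)
```

Because `ast.unparse` discards comments and original formatting, a production code mod would normally use a tool that preserves the concrete syntax; the sketch trades that away for brevity.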
[agent mistake loop] "So that's then the expanded harness and then as a human as we've heard in a few talks uh here as well as the human our job in part becomes kind of steering this set of guides and sensors. So as Mitchell Hashimoto says in his blog post it's the idea that anytime you find an agent makes a mistake you take the time to engineer a solution such that the agent never makes that mistake again." -- L0519-L0528 (19:04-19:26)
[sensor placement] "And then you should think about like where you put those sensors, when you run them, right? So, uh kind of like strategically think about your path to production and think about when you want to run them. So, do you want to run them in the coding session, right? Which is I think uh whenever that's possible in terms of like how cheap is it, how fast it is to run a sensor, I think you should run them like even before you commit, right? So, I have this box here about integration, right? So it kind of like depends what that means for you. So probably 80% of my commits in the last 15 years have been put straight onto the main branch which is probably not the case for most of you. Um so integration could either be like for you to say okay I want to do all of those things before I even create a commit or it could be as part of the pull request uh process where you run some additional like uh inferential sensors or something like that." -- L0561-L0584 (20:44-21:34)
[CI boundary] "Right? Then we have lots of stuff in our continuous integration pipeline already. Right? You probably don't want any inferential sensors in there because you you don't want, you know, the the green or red state of your pipeline to depend on semantic interpretation of an LLM, right? But we have like lots of uh computational sensors in there. And then" -- L0585-L0593 (21:36-21:54)
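The placement rule from these quotes (run cheap sensors early, keep inferential sensors out of the pipeline's pass/fail decision) can be written down as a small data model. This is a sketch under stated assumptions: the `Kind`, `Stage`, and `Sensor` types and the example sensor names are hypothetical, chosen only to mirror the vocabulary of the talk.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    COMPUTATIONAL = "computational"  # deterministic: linters, type checks, tests
    INFERENTIAL = "inferential"      # an LLM judging the work of another LLM


class Stage(Enum):
    SESSION = "session"            # inside the coding session, before commit
    PULL_REQUEST = "pull_request"
    CI = "ci"


@dataclass(frozen=True)
class Sensor:
    name: str
    kind: Kind
    stage: Stage


def ci_gate(sensors):
    """Sensors allowed to decide the pipeline's red or green state."""
    return [s for s in sensors
            if s.stage is Stage.CI and s.kind is Kind.COMPUTATIONAL]


def misplaced(sensors):
    """Inferential sensors configured for CI, which the talk advises against."""
    return [s for s in sensors
            if s.stage is Stage.CI and s.kind is Kind.INFERENTIAL]


# Assumed example configuration.
sensors = [
    Sensor("linter", Kind.COMPUTATIONAL, Stage.SESSION),
    Sensor("unit-tests", Kind.COMPUTATIONAL, Stage.CI),
    Sensor("review-agent", Kind.INFERENTIAL, Stage.PULL_REQUEST),
    Sensor("llm-style-judge", Kind.INFERENTIAL, Stage.CI),
]
gate_names = [s.name for s in ci_gate(sensors)]
misplaced_names = [s.name for s in misplaced(sensors)]
print(gate_names, misplaced_names)
```

A check like `misplaced` could itself run as a computational sensor over a team's sensor configuration.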
[production observability] "And then finally, I should also mention uh there's of course also this way of having sensors like in production that you give AI access to, right? Especially when it comes to your architecture fitness, things like scalability, latency, all of those things. There were a few talks here at the conference as well about using observability data and uh you know both to help you fix incidents but also just to like monitor how you can make your uh your runtime better." -- L0631-L0643 (23:14-23:40)
[autonomy / supervision] "continues, right? We want more agent autonomy and we want less human supervision. And as part of that also something that has happened over the last 12 months is that it has become a lot easier to run agents with uh no supervision. Right? So um this" -- L0666-L0672 (24:32-24:45)
[background agents] "uh over time now also most of the the harness products the coding agent products have now come uh come out with like platforms where you can always decide like do I want to run this coding session locally on my machine or do I want to do it in the cloud and so it's become a lot easier um to do this and uh to to try this out right like whatever size you feel comfortable size and complexity of tasks you want to do this for. Like some people run it to actually like build full features and others maybe like just dip their toes in like clean up this feature toggle or like small cleanup tasks, right? And uh also" -- L0676-L0691 (24:54-25:26)
[swarms] "in the meantime we've kind of like taken this like uh to more extremes right which is more an experimental stage right now I would say uh which is this idea of like swarms or really brute force just sending lots and lots of agents out there and also having the agents decide how many agents they need right um uh Gas Town got a lot of attention in I think it came out in in January there were these two like big or even more experiments from like Cursor for example or Anthropic, the C compiler and the browser. Um, Claude Flow, I think it has a different name now. That was probably even earlier than Gas Town last year. Uh, kind of people were playing around with that. So, that's kind of like taking it to the extreme and seeing how we can push the boundaries and actually have AI build uh much bigger things more autonomously." -- L0692-L0713 (25:30-26:14)
[AI timeline] "yeah. So, this is our fourth year into this. Um so we start with autocomplete then we had a bit more like integration into the IDEs more context Claude 3.5 Sonnet was like an early model moment I think where uh I certainly from that point on just almost always used Claude Sonnet because it just like um felt so much better at coding than the other ones. Um so then um we got these like at the time most people or I also in my presentations still called it agentic coding modes right so that's when like the the um like like uh Cursor and so on they got like these modes where they could also run terminal commands which which had been a thing that was already out there in open source but not as widely used and MCP. So that was only about one and a half years ago. Um then shortly after that the the vibe coding term got coined by uh Andrej Karpathy which then led to like a lot of attention of people discovering these new agentic modes and going oh this is like so there was like a wave of like people picking this up again and saying oh this has actually improved quite a bit. Uh then we got these kind of background agents that I just talked about right like so Codex for example you know allowing you to run things unsupervised in the background Claude Code is I think generally available probably about a year old it's like it started a little bit earlier than that the context engineering term also started gaining traction about a year ago uh then we had that Claude Opus moment we got skills OpenClaw is maybe also relevant uh a relevant moment even though it's not quite directly about coding. Um, and then yeah, so like kind of beginning of this year, we got this like next wave of people paying attention again and going," -- L0715-L0760 (26:20-28:04)
[costs] "So it's come a long way but the costs have as well and I don't just mean the token costs." -- L0785-L0787 (28:58-29:02)
[changeability] "um then changeability is a big thing, right? So like code quality defined as code that remains easy to change and where it remains easy to change it with low risk, right? So this was a change uh I recently introduced into a still relatively new codebase that was all created with AI and I made a change that touched 41 files but it shouldn't have. it wasn't like that big of a deal and so it was like a clear smell that there was already like accumulated tech debt that was making changes more risky and more uh costly. So but then also here the question is like how far can we push this more and improve this more with guides with sensors with static code analysis and so on. Token cost is the" -- L0808-L0826 (29:51-30:33)
[cognitive load] "um another type of cost is like cognitive load and burnout right like who would have thought, right? It doesn't actually make our lives more relaxed on the contrary, right? Like some people are working more even though they they already create more output. And Steve Yegge had this analogy with the energy vampire from What We Do in the Shadows. I don't know if any of you have seen that uh that show, but he basically yeah, he's a vampire that doesn't suck blood but that sucks uh energy. So there's lots of stories about people saying, "Oh, I can only do this like three hours in a row and then I have to like take a nap."" -- L0847-L0863 (31:21-31:54)
[review crisis] "Um, then we have the review crisis of course, right? We have like higher coding throughput. But then can we if we can code faster, can we review faster, test faster, ship faster, review faster? We've definitely so far the state is no, we cannot review faster, right? Everybody's complaining about this about this pain. Uh, but it's not just coding." -- L0864-L0872 (31:58-32:16)
[flow crisis] "know, there was like another another kind of like thing popping up before the coding where the the product managers were actually like churning out lots and lots of prototypes and lots and lots of ideas, right? So now there was this weird like bottleneck between this pile of prototypes and the pile of code because they couldn't get it like to sync up again and nobody really could figure out how to converge on what they actually wanted to build because they were doing this kind of like in these two uh silos. So that's maybe like something like an open question that we're already seeing a little bit, but are we are we heading in general towards like a flow crisis, right? And whenever I want to understand flow better, I turn to my colleague uh James Lewis who like has done a lot of very interesting presentations about flow that you know it's easy to find on on YouTube. Um but here's one where he's talking about congestion uh collapse. Um, and so he has this uh this prediction apparently he has a bet open with Gene Kim for a crate of beer that this will happen and that this will become like a big topic of conversation that we're just like overloading and also overloading in these different silos and that everything will just become super slow at some point and just uh collapse. So this comes you know from I think theory of constraints and and stuff like that. And here he's he's quoting the Don Reinertsen book, The Principles of Product Development Flow, um as well." -- L0879-L0917 (32:32-33:54)
[risk assessment] "it depends on the situation and um what does it depend on, right? So for me the way I think about it right now is as this risk assessment uh out of probability impact and detectability right which is very typical kind of components of risk assessment in all" -- L0934-L0940 (34:39-34:54)
[probability] "kinds of areas. So, first I think about the probability that AI gets something wrong or gets something right. And that's all about me knowing the things that I talked about before, my AI tool, like what my context is. Did I even give the agent a chance to do it right? Um, and it's also about me reflecting on my confidence in my requirements, right? Like how certain am I even that I even know what to do. Um, then I think about" -- L0941-L0951 (34:56-35:16)
[detectability] "the third thing is I reflect on detectability that AI got something wrong. Will I notice? Right? And um by the way, all of this starts with knowing what right and wrong means, right? I should and often that's like, is it appropriate? Right? Uh so you have to know your uh feedback loops basically and then based on these things I decide" -- L0961-L0969 (35:40-35:57)
[supervision level] "and then based on these things I decide which workflow do I use, how much review do I do and how long do I let it go without supervision. For example, if I don't even know myself yet quite what I need, then I won't let it go off for like half an hour and then realize that it was all for nothing, right?" -- L0969-L0976 (35:57-36:13)
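One way to make the probability, impact, and detectability assessment from these quotes explicit is a small scoring function, loosely in the style of a risk priority number. Everything numeric here is an assumption for illustration: the 1 to 3 scales, the multiplication, and the thresholds are not from the talk, and the level names borrow the in/on/out of the loop wording the speaker uses elsewhere.

```python
def supervision_level(probability_wrong, impact, detectability):
    """Map three 1-3 scores to a supervision level.

    probability_wrong: 1 = AI is very likely to get it right, 3 = likely wrong
                       (for example, unclear requirements or thin context).
    impact:            1 = low cost if wrong, 3 = high cost if wrong.
    detectability:     1 = I would probably not notice, 3 = easy to notice.
    """
    for score in (probability_wrong, impact, detectability):
        if score not in (1, 2, 3):
            raise ValueError("scores must be 1, 2, or 3")
    # Hard-to-detect errors raise risk, so detectability is inverted.
    risk = probability_wrong * impact * (4 - detectability)  # range 1..27
    # Thresholds are arbitrary assumptions; tune them to your own context.
    if risk >= 12:
        return "in the loop"
    if risk >= 4:
        return "on the loop"
    return "out of the loop"


# Unclear requirements, high impact, weak feedback loops: stay close.
print(supervision_level(3, 3, 1))
# Small cleanup task with good tests: can run unsupervised.
print(supervision_level(1, 1, 3))
```

The value of writing it down is less the score itself than being forced to answer the three questions before starting a session.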
[cognitive costs] "So, we're tempted to move from in the loop to on the loop to out of the loop. I feel this every day being drawn to like ah I don't want to look at this. I don't want to look at the code anymore. It's like uh yeah, there's all of these forces that pull us there. But we're also starting to actually feel the costs. It's not just speculation anymore. It's not just like dooming kind of predictions. We're actually feeling the cost of tokens risks and cognitive X, right? Cognitive load, cognitive debt, right? this idea uh of like we don't even understand anymore how our codebase is structured. Um I sometimes think of like cognitive deferral as well. It feels like we keep like deferring the review to other people or like deferring processing what actually happened. And recently uh there was like a new cognitive X term coined that um I I saw because Addy Osmani wrote about it which is cognitive surrender. Right? So" -- L0991-L1014 (36:48-37:41)
[cognitive surrender] "great. And so they're they're talking about a mode where we're basically displacing system 2 like our actually like active thinking with AI and they call it cognitive surrender. And apart from like this paper and the cognitive part of it, this term like surrender has just been stuck in my head like ever since I heard this. And I feel like we're there's like so many things where we we're in danger of just like surrendering right now, not just cognitively. Um and so I think we have to be careful about what we are surrendering, right? And like really think about be mindful of uh where" -- L1020-L1035 (37:55-38:27)
[junior sustainability] "instead of teaching somebody right everybody's talking about oh how will juniors learn how will we do this but I don't see that much like active action right because everybody's just in a tunnel of you know I see experienced people building tools for themselves to use AI better but like why are we not taking more initiative to think about how this is sustainable for people who don't have all of this experience" -- L1041-L1051 (38:43-39:04)
[leadership / environment] "So if you are a person of influence in your engineering organization, are you creating an environment that leads to surrender, right? That makes people feel like they just have to crank out the PRs and don't have time to like actually figure out how to improve the environment, how to improve the context engineering and so on. Um if you're if" -- L1082-L1090 (40:10-40:25)
[sphere of influence] "you maybe feel a bit like powerless or you feel like you can't influence it and there's all these pressures around you still like try to think about your sphere of influence and the small things you can do like what do you really have to surrender right um so there's lots of options between not using AI at all and like just total surrender and hoping hoping the models will fix it all in the future and again like if you feel kind of like powerless and like this is like washing over you we can also work together and collaborate on these things Right? So if you feel like you're not a good communicator about this, maybe look for somebody in your team or organization who's good at that and share your data, share your observations with them. Right? So like we can collectively do a lot about this as" -- L1091-L1111 (40:28-41:06)
[closing skills] "well. In terms of skills, we need to know our toolbox past and present. We also need to rediscover some things. Uh we need critical thinking, risk assessment, and some form of like patience, right? So we need to evaluate that productivity maybe doesn't just mean like typing and for leadership you know this is still horizon 2 right this is not horizon one yet so maybe we don't know the ROI yet maybe we just have to like give people some time to um to create a good setup so that we can continue to safely and quickly deliver software to users in a sustainable way." -- L1112-L1126 (41:08-41:41)

ainativedev/aidevcon-2026-ldn


Quotes -- State of Play: AI Coding Assistants
