ainativedev/aidevcon-2026-ldn

AI Native DevCon 2026 London — all conference sessions as interactive skills

Transcript — Harness engineering beyond code

Speaker labels not present. This transcript was captured without per-speaker labels (Granola-style verbatim). The talk is overwhelmingly Marc Sloan speaking solo, with one audience question near the end. The opening few sentences ("What sessions first if you do mark session. Don't record mine. There. Bringing product and design contact outside the station. Beyond code linear tickets.") appear to be stray pre-talk room chatter and are preserved verbatim but should not be treated as talk content. When attributing, prefer "Marc said…" for the body and "an audience member asked…" for the Q&A. Do not invent named attributions for the questioner.

Speech-to-text artifacts preserved verbatim, including: "Tessla" for Tessl, "DevCon"/"Devcon", "fiddle" / "fiddling" / "Figma" inconsistencies, "harvest engineering" for "harness engineering" in one place, "PR" rendered as "piano" once, "shrimp" for "shrink" once. Do not silently correct these in quoted material.

Pre-talk chatter (ignore for content questions)

What sessions first if you do mark session. Don't record mine. There. Bringing product and design contact outside the station. Beyond code linear tickets.

Intro and framing

Hello. Everyone, nice. To see you. Really exciting to be here. Helping to kick off day two on DevCon. And I'm impressed by the number of people in the room today. You have made it through the weather, the cheap strikes and you're here. I know I had my own battle to get here this morning as well.

I'm Marc Sloan. I'm a member of the product team here at Tessla and I'm here today at a developer focused conference to talk about non-developers. Now I don't know if there are any other non-developers in the room here today. I'm thinking, yeah, there we go — product managers, designers, executives, the people who traditionally wouldn't touch code, although I think we're all turning into developers these days.

This talk is about the work that they create, but also it is for everyone. Rest assured. Because as we talk increasingly about the context that we give agents, coding agents, a lot of the origination of that context comes from non-developments. It comes from product and design teams. And so what I'm going to talk about today is the creation of that context. The challenges we face in exposing that to our coding agent and their harnesses. And what solutions to those challenges might look like?

Now over the last 24 hours, the conversation of DevCon has been pretty much about one thing. And that's giving agents the right context. I think it's been super interesting working in this area over the last six to 12 months. And we've seen the methodology we use to do that evolving from spectrum development where we write up front all the context that the agent's going to need in order to get the test done. To context engineering, just kind of where we are just now. And have it evolving into harness engineer. Ing.

And the tools are maturing really fast in this area. But a lot of it has been focused on context that originates and lives within the code base. The natural form that that context has been taking recently has been in the form of skills. Now, I don't think I need to convince anybody here of the value of skills and how powerful they are. There was a great quote in one of the talks yesterday about how the simple solution is often the right one. And the fact that the markdown file is so flexible and adaptable makes it a natural place to be able to put code based context. I think another powerful aspect of it is the fact that that context can live alongside the code that's described. That it lives within the repo, that it's managed, it's bursied and can be subject to evaluations and tests and all that good stuff that comes with the harness. And we're still figuring out how to manage that kind of context, right? All the stuff we've been talking about with harness engineering, the features that you see in the Tessla product are all about figuring out how we turn skills and context into code.

Worked example part 1: the export button

So to kick us off and to kick off this conversation about pride and design context, I'm going to walk through a typical example, the kind of thing for everybody in this room is familiar to everybody in this room. We need to try a scenario here where we've got company that has web app product and a customer has asked product manager to put an export button on the dashboard in that directory. Now, the export feature was something that was hidden behind the settings menu and there's a couple of layers deep, so it's very difficult to find. Customer wasn't aware that it existed. And requested this to be in a more prominent position.

On the surface, this seems like the perfect kind of job to give to an agent. Why is any ticket that describes it? Put a Figma design in there that shows where the button should go, what it should look like, all that good stuff. Fire enough to your agents, have it produce a PR, job done. Customers happy. The feature got delivered, the customer drove it out and said, it's great. So cool. This is the world we live in right now. We can just set these things off to agents. And of course, the agent wasn't building this blind. It had access to a bunch of skills that had been developed in advance. To instruct the agent on how it should be thinking about the architecture of the code. About the front end components that live within that. The testing CR requirements and how to make you solve the API to get the data. And of course, this is a cutting edge state-of-the-art company. So they're using Tessla to manage all of this stuff.

So problem solved. Why do we need to be thinking about product and design context? Well, in this scenario, the agent got one thing wrong. It actually ended up using a generic react button component that lived within the code base and not the more specific export button component. That was the right choice here. And this might seem like a trivial error. But encoded within that export button component are the accessibility requirements that this organization adheres to, interaction patterns that make sense for this kind of async. The async operation of this kind of button. And this was all a missed by the agent in creating this.

And what's challenging here is that as far as the harness was consumed, everything went smoothly. The agent got a ticket handed off to it, did the job, created a piano passed, and the feature went out to the customer. Evaluations that were set up on the skills and the other aspects of the harness all seem to be working well. But the product and design context was missing from the agent in this particular situation. And this can have real consequences because while the customer was happy out of the bat in this situation, it might be that down the line, the app then ends up failing in accessibility at delays of critical. Deed and that kind of stuff.

So we might ask ourselves, well, how should the agent have learned about that? How can it find the right context to do it? Well, the context was in the design system in Figma this whole time. It turns out that the design system team had thought this stuff through and already thought about how this button should be created, how it should be designed, and how the interaction should work with it. In fact, the thing with design that was in that original Linear ticket did correctly use the export button design component, but the agent just didn't pick this up. It used the fiddle design as a visual treatment. And then from that inferred that it should be using a generic button that it customized.

Where product/design context fits

As we think about building agent harnesses, as we've been doing for the last 24 hours in this conference. We might ask ourselves when is product and design context fit into this? Now the way I think about aging harnesses is that it's a layer that wraps the code base itself. And that layer contains skills, evaluations, hooks policies, all this other stuff that we're working on at the moment. And we're using tools like Tessla to manage. The product and design context I see existing in this layer surrounding the harness, at least right now.

And it's not just that it's a layer outside of the harness, but this data doesn't live in the code base by design. It lives in the third party tools, that product managers and designers and so on are using day to day. Every day context is getting created in these third party tools. And the code base is blind to it and therefore the agent and the agent harness as well. And it makes sense that it lives in those tools because that's the tools where product managers and designers and more are spending their time. Designers are going to continue to work in Figma because it has a canvas and a great set of tools for creating designs. PMs will keep working in Linear and Notion and more. Customer data is going to continue to live in CRMs. And for as long as that keeps happening, we're going to have important context that agents need in order to build the right things and do the right job living outside the code base. And not just the context agents need to get the job done, but also the context we need to understand whether the job got done correctly.

Challenge 1: it lives outside the code base

So challenge one is the fact that product and design context is important to agents because it lives outside the code base.

Worked example part 2: drift

If we go back to our example, let's imagine the team is picked up on this and decided to try and bring that context into the code base in order to solve this problem. They've set up a storybook, they've gone through the design system in Figma and make sure all of the design components that are there are well represented in the storybook with the appropriate documentation and interaction patterns and so on. This means the next time a request like this comes in, the agent picks up the right component correctly and gets the job done. Perfect. This is all working.

But the world keeps evolving and the product and design context isn't static. You can imagine in this situation something like this where a new deal comes in and with this new deal comes a new set of accessibility requirements, let's say. Let's say it's equivalent agency. The PM picks this up and produces a Notion document that outlines what these requirements are. The design team update Figma and the design components there to make sure that they are fully compliant. And while all that's going on, the code base isn't falling increasingly out of sync with what's going on in product and design Act. And the agent is still producing features using the outdated design system. And so for as long as we accept that product and design context is going to live outside the code base, then we're going to have this problem of the two falling out of it with each other.

Challenge 2: drift in both directions

And I've talked about this in one direction so far. With the product and design context involving as the code stays the same. But the opposite is also true. Especially as we have more and more agents being thrown into the code base and building new components from scratch and updating existing components, we're also going to find ourselves in the situation where the app itself has a whole bunch of components. The design team know nothing about. And so we need to be able to move the context in the other direction for them as well.

MCP as the obvious-but-incomplete answer

Now in all of this you might be thinking isn't this a solve problem? Should only these tools have MCP servers today? Isn't it just a case of connecting the agent to the MCP server and we're done? That's true to an extent. But the fact that we need to have this live connection to third party product and design context also introduces a set of new challenges, especially as we announced up to think about the age of harness as being something that's managed and controlled and we want to be improving all the time. We now have a live connection that we need to maintain and work around the rhythm that imposes. Threads and third party availability.

We also get into a situation where the evals and things like that that we're setting up on the skills and the context in the code base is now reliant on third party context that's constantly changing. That doesn't necessarily have a canonical source of truth. And of course that's the problem with product and design context. It is constantly up to date. There's never really a definition of done there. It can be in a complete state by virtue of its nature. So if we're building harnesses that are reliant on that, we have to be very careful about how we make that work.

So let's pause for a moment to reflect on what we've established here. I'm saying that for agents who work effectively and for the harnesses that use them to work effectively, they need to take into account product and design context. And this by its nature lives outside of the code base. It by its nature falls out of sync with the code base over time. And that trying to maintain a live connection between the agent and that context. Introduces a whole set of new challenges that we have to deal with.

Lessons from Figma Dev Mode and Code Connect

Now, everything I'm talking about might have seen. Theoretical or we're still just trying to get ahead around context in the code base around skills and how we manage skills and so on. Is this really such a big deal at the moment? Well, everything I've described so far are real challenges that I faced in my life prior to Tessla when I was working at Figma on their Dev Mode product. For those who don't know, Dev Mode is a feature in Figma which allows the designers to hand designs off to developers. Human developers in this case back in the stone age, these are the tools we were using. And it gives the developer lots of great context about the designs that they can use in manually creating the code to implement those designs.

And over the last couple of years, we've seen that feature evolve into things MCP server, the live connection. And what we're doing here is just taking that exact same design context but giving it directly to an agent via the MCP server. And it was in building this that we started to see all of those problems that I've been describing. We would see agents picking up the designs and doing a fantastic job in creating front-end codes that looked just like the design that functioned as a designer intended. But the developers would look at the code and it would have made up a bunch of components when they otherwise existed. Or used them in the wrong way or completely missed that there was a design system that should be working towards. These are real problems that I encountered with customers every day.

And so we started, we realized quite quickly that we needed to find a way to bridge that product in design context gap with what agents were doing. And to that end, an early feature we started working to address this was Figma Code Connect product. And here is a tool that explicitly allowed design system teams to connect the design components in their design system with their equivalent in the code base.

Now, I don't want to go into Figma and Code Connect too much. The point here is that. These aren't new problems. These are problems over the last several years that product teams and design teams and devs are dealing with day-to-day already. We were already dealing with a product and design context to agent harness problem before we'd even define half of those terms. And struggling to find the right solution for that.

And. From working on that over the last couple of years and now with my role of Tessla thinking about how those kinds of things apply to harness engineering that I think there are some useful lessons we can take from that as we think about solutions to that problem.

The first lesson is that. In building this Code Connect tool, we were aware that there was that connection between product and design context. And what's in the code base required dedicated maintenance. In enterprises there are entire teams who sole job is to make sure that that connection is up to date. And makes sense and that there is equivalence between the two. One of the reasons those teams exist and that maintenance is required is because the design context and the product context is constantly falling out of sync with the code base. Both are evolving at their own pace with their own reasons. And it requires dedicated effort to keep the two reconciled.

And one of the learnings was that sometimes it's not desirable actually to keep the two completely in sync. There's an acceptance that they will of course have to be out of sync because of the pace of design work and product work and dev work and so on. And it's the job of this team just to make sure the status quo is functional and working to keep track of that over time. And I think that's broadly true not just in design and code context but product and design context in general.

The third lesson I think is interesting in that as someone working in product management on this product, I was also very much aware of the business and structural constraints around the tools we were building. Figma as an organization is very protective of its design system IP. It didn't just want to give away the keys to the house. And so we had to be very careful about finding the right balance between supporting agents and the developers using them, but also not giving everything away and destroying the work that they've built over time. And I think that's true across all of these product and design SaaS providers, not just Figma.

Three (plus a sneaky fourth) evolution directions

So as I spend my days currently thinking about harness engineering and how we make sure that agents have all the information they need to do a great job for developers. And I take on board these lessons that I've learned from my experience in trying to solve this in other organizations. I see maybe three directions that the harvest can take in order to solve this. And if we've got a bit of time there's a sneaky forth in this.

One. The first is that we can see these tools perhaps opening up. And we're already seeing this. The number of these third party tools that have MCP servers just now is exploding and we're also seeing them expand with their own APIs and CLIs and so on. I think you still do see restrictions and rate limits in place and we're also seeing this surface being an opportunity for those organizations to create new monetization pathways as well. So I don't think the solution here is as straightforward as all the data will just become available for everything. But I do think this is going to get us some of the way there. And it's something where we're in the middle of at the moment.

Another option is the idea that there is an agent that sits between the third party product and design context. And the harness agent. In particular the lesson I learned was about this connection with wiring dedicated maintenance and pruning over time. And increasingly we did see those teams using agents to assist them. So it's entirely reasonable that they start to get to a place where they can automate a lot of bad work. And I think as we think as well about the harness itself acquiring its own agent in order to build and manage that harness. You know Tessla.io starts agent check it out. We can see the role that those kinds of agents might have expanding to cover not just managing the context within the code base but also the context that lives outside of it as well.

Finally. Another option here is that this context that lives outside the code base and is unmanaged and unvisioned perhaps just gets swallowed by the repo and we'll see it making inroads into it and then being able to be managed by the harness directly. And I think what's interesting with this is that there are some challenges here. Certainly. We're going to have to still meet product and designers and other non-developers where they're at because designers are still going to want to design on a canvas regardless of where that data ends up living. Likewise with CRMs and things like that. So we need to meet them where they're at. But it could be that there's just still so much value in having this context within the harness that we see an organic pull of that data into a place where it can be managed and versioned and evaluated and all this great hard stuff.

I also think what's interesting about this direction is it also calls to the fact that. We as non-developers are increasingly looking like developers. In fact there are several talks yesterday and today specifically about this top now and about product managers and designers and so on creating PRs and actively contributing to the code base. This itself might be the impetus for us to fold the kind of context that we're generating day by day into the repo.

Worked example part 3: PM-driven, no developer

And with that in mind what I want to do is go back and revisit our example from earlier. And think about. The situation where perhaps a developer wasn't involved in this at all. We can imagine a PM like myself had that conversation with the customer. Transcribed that conversation using Granola, fed that into Claude Codeword. I have my own set of skills there that are able to look through transcripts like that and identify features from them turn those features into Linear tickets. I have a skill that encodes my roadmap and makes sure that this fits my roadmap and has the right priority. And you know one thing to call out here is that I as a non-developer might be developing these skills outside of the code base, these aren't versions, these don't have, these aren't accessible to my harness. They're not aware of that. So we need to make sure that this might be the first way that we bring some of that context into the code base to manage that kind of thing.

Of course the feature from the Granola transcript into a limiter get and then fires off that flow familiar. The feature gets built and the customer is happy. Except. What the PM didn't know is that there's a reason this export feature was hidden behind various settings menu because of some architectural decision that was made sometime in the past that made exporting the customer's data are hugely expensive task. And therefore we didn't want this to be something that was widely available and operating with high bandwidth. This causes a PR with a huge number of changes and all of this is invisible to the PM. Who's happily tapping away on their computer clapping themselves on the back for the fact that they're able to push PRs and make features by themselves.

And what's missing here is I know personally from my own experience of managing technical teams when I deciding which features to work on, I massively value the feedback and conversation that I'm having with my team to let me know, help me understand the technical effort that goes into creating some of these features. That helps me understand the technical debt that we might be creating as a result of this. And it can be just as valuable for a product manager to know when not to build something than to just look at a backlog as a to-do list that we just need to throw agents out and work through.

And so another challenge here is that so far I've been talking about bringing product and design context to coding agents, but we also need to think about it this the other way around. And bring coding context to the kind of decision making that's happening outside of code base and that still affects it. And so one way this might evolve is, you know, we keep talking about coding agents and the harnesses that enable them, but perhaps it's going to be more like each of the different roles within product team will have their own agent and their own harness. And these harnesses are going to have to talk to each other. And maybe this is what the solution ends up looking like.

Closing

Now it's not clear which direction these things are going to go in and we're obviously going to see a lot of progress being made on this front in the next six to 12 months. I think the thing that's clear to me is that product and design context is critically important for coding agents and as we work on harnesses for coding agents certainly over the next couple of months. We need to take that into account and figure out the best way to overcome some of the challenges that we talked about today. I'm sure folks in this room are actively working on this. I know I am as well and I'm really excited to be working in this area. And with that I'd like to say thank you so much. It's been a great audience. I'm happy to stick around and answer a few questions. Thank you very much.

Q&A

Right, I'm going to try and do these tables and get the mic to people. If anyone's got a question please raise your hand. Any. Question is panning. Thanks. Alan. Thanks for the. Doc.

[Audience member, unnamed]: My first question comes in. If you're feeling more context. In addition to what we're already feeding in like where do you draw the line and how do you group them? When do you feed in a bit of context that's needed? I remember there was some pine robot activities at my workplace where they were asking what all should be added to the code base. So there was this whole watch list of things including design and everybody wanted to get everything in the code base because they want to feed all of that. Like you know Slack conversations teams conversations your email conversations with your client. It's like maybe you grow where would you draw that line because obviously you need context but then context is king but then which part of it is quite subjective as well some of your thoughts on that.

[Marc]: Yeah it's you know where do you draw the line in terms of adding context to the agent when I'm advocating not just for code based context but also product and design context which is likely even larger than what's in the code base. I mean I think this is the core problem of context engineering and harvest engineering in general. Even if we just remove the product and design context question for now it's already a huge challenge to figure out how to encode all of the context that might live within the repo and its architecture and its norms and so on into something succinct enough that's relevant enough for the agent to do the right job.

And part of the reason we are setting up things like evaluation architectures and so on is so that we can actively work to get that context down into just the right size but also make sure that that context is actually doing a good job to evaluate and so on. We're going to have the exact same problem. I think with product and design context in fact that makes it harder because it's I think as I've said today by its virtue it's messier, there's more of it. It's in lots of different places.

And so I think we just need to, well for now we need to apply the same kind of tools. We need to figure out which design context is signal versus noise. And perhaps have the agents to help us decipher that. And you just shrink that down into formats that are seen sick enough to either let our skills in the code base or maybe there's some other methodology that we shrimp is to and have that be accessible to the agent. And then we need a set of evals to make sure that our agent is actually being boosted by that context. It isn't being overwhelmed.

So the short answer is I think we need. To apply all the same principles with code based context to product and design context. With that extra layer of difficulty that comes with that because of it being external. Exactly.

We've got time for one more question for any questions in the room. If not thank you very much please give up the Marc again.
