Speaker-label warning: This transcript has no per-speaker labels. It is overwhelmingly delivered by Maximiliano Firtman ("Maxi"); the session host Simon Maple opens and closes but his words are minimal and not separately marked. The transcript also contains many speech-to-text artifacts that should be read with charity, including but not limited to:

"Web MCB" → Web MCP

"MCB apps" → MCP apps

"Cloth" → Claude

"JNDPT" / "JGPT" / "charge GBT" → ChatGPT

"OpenCloud" → likely Opencode (or similar agent)

"CloudCode" → Claude Code

"Corsair" / "Cursor" → Cursor

"colleagues" (when said as an entity) → Google

"Gong" → DOM

"by coders" / "by coding" → vibe coders / vibe coding

"Asians" / "ancient" → agents / agent

"burning Belgians" → burning billions

"ORIGIN trial" → origin trial

"ASIC operation" → async operation

"darts on Doom" → a [port of] Doom

"Aspirin" → async (or similar)

"verify several tools" → register several tools

"exit" (after agent decides) → execute

"exposed to" → exposing

"fix it" (re pixels) → pixels

"Webex" (when alongside Claude/ChatGPT) → likely Web apps or another agent host

Preserve the verbatim text below when quoting; interpret in surrounding prose.

§1. Speaker intro

How to make our web apps faster and cheaper for coding agents, actually for every kind of agent. So really quickly, my name is Maximiliano. I go by Maxi. I'm from Argentina, and I live in Buenos Aires. I've been an app developer for almost 30 years, right? A web developer for thirty years, and I have also done mobile apps. I have some books and online courses with different providers. And I'm also a Google [unclear] and an ambassador from OpenAI. Both are working together in different ways.

§2. The agentic web — three vectors

Of course, this is all in terms of the agentic web. That's kind of the name. So we know that agents want to build for the web. When most people, even vibe coders, are asking for an app, the result is usually a web app. The web is kind of in prime time when you are vibe coding. But also the web wants to run agents. This is just starting, but now you can inject different agents into your web apps or your website in different ways. You can inject an agent that runs within a web app. And we also have something known as MCP apps, which are web apps that run, for example, inside Claude or inside ChatGPT, inside AI apps. So there you can also have a web app inside the main app. Also, users want to browse the web using agents: agentic browsers. In Chrome, at least in the US and some other countries, you have Gemini there. We have ChatGPT Atlas. So while the user is browsing a website, the user can access a chatbot or an agent to do something on that website. And also agents by themselves want to browse the web, in different ways. So today, we will focus on those three items.

§3. How agents access the web today — and the costs

Okay? On a solution that we have, or that we will have, that will help us as web developers. We are web developers and we want to enhance the ability of agents to work with our sites. We know that we have the World Cup coming in a few days. So let's say we created a website or a web app to follow the World Cup, the FIFA World Cup. So you have a way to search for matches, and you can follow your team. Okay? And you have things like date pickers, so you can see the matches that you have on each date. If you are asking any agent, it can be OpenClaw, it can be Codex, it can be Claude Code, to check information on this website, it may take time to understand the user interface. There are many ways that agents are accessing the web today, and it takes time, and also context and tokens. So it is time, tokens, and context. And we know the context window gets crowded, plus there is token usage. For example, if you want to see the round of 16, the semifinals, and the final, for an agent to understand all the information that you have here will take time, maybe a couple of minutes, to actually browse that website as a user and get that information.

So when agents are browsing the web today, they are using several techniques. First, you can create a connector, or you can ask the agent to use a connector that the company behind that website has to implement, or a CLI. I don't know, you publish something on npm, so then you have an npx command that you can execute, or your coding agent can execute in the terminal, if it has permission, to actually get the data from your service.

Then we have the old-fashioned way: fetch, to make an HTTP request. The problem is that when an agent is trying to use that technique, such as curl, it won't execute the JavaScript or any user interface. It's just getting the HTML from the server. If it's a single page application, it's not working. So you're not getting the real data.

Of course, you can now use browser use on many agents: a plugin that you can install, mostly on Google Chrome or Chromium-based browsers. For example, you have one, and you have another one from Google. And in that case, the coding agent can talk to that plugin, and that plugin can read your DOM and other parts. It can also be the accessibility tree. And it can take screenshots. It's taking a screenshot and then it can analyze the content. So it can be the browser plugin, or the agent has its own browser, which is kind of a web view based browser that the agent can also use to browse the web as a user.

And finally, as the last step, we have computer use. So these days, if you want your coding agent to use Safari, because you want to test how Safari renders your website, it's not so simple for agents, today at least, to actually connect to Safari. So they're just using computer use, which is just using your computer, like using a mouse cursor. So the problem is that agents are burning billions when the browser is just a guessing game.

It's observing screenshots, the DOM, the accessibility tree. Then it's inferring what needs to be done: so I need to click there, I need to click on the date picker. Then I need to take another screenshot, for example. I need to act on that inferred data, such as: I need to click on that calendar icon and actually see that there is a calendar that has been opened. And of course, we need to repeat this. When I click on the calendar, I need to take a new screenshot and then see what's happening on the page. So this is consuming a lot of time, tokens, and context from our context window. So it's a problem. So we need a new solution.

§4. From inference to contract — Web MCP intro

And Web MCP is here to try to solve some of these problems. I mean, not all the problems, but some of them. Because Web MCP will let the front end expose capabilities and not just pixels. So without Web MCP, the agent is watching, is analyzing, is inferring over pixels, selectors, DOM guesses, and retries, and again, every time it clicks, it has to work out what's on the screen. And now we are passing from inference to a contract that we as web developers define. That's Web MCP. In this case, we are going to expose tools. A tool will have a name, a description, a JSON schema, which we will see in a second, and an execute function. In this case, it's JavaScript, for example, a JavaScript function that the agent can then actually execute. So instead of inferring as a user with the UI, we have a contract. We are exposing an interface to the agent.

§5. Why not just MCP?

But then you may be thinking: we already have MCP. Why do we need Web MCP? Well, the AI agent is kind of in the middle here. It's planning, and it can use MCP, but MCP connects an agent directly to the back end: servers, APIs, data, workflows. And that has some challenges. Authentication challenges, and on some websites maybe the user has data in local storage or IndexedDB, or we can be talking about sensors. Right now, for example, we have web apps on display glasses. So you can run JavaScript in web apps on your glasses, and you have sensors there. Well, maybe the agent that is running on your glasses needs data from the sensors. You're not going to get that from the back end, from the server. Right? That's why we have Web MCP. That is kind of the way that we have to talk to the front end. Right? So it's not one or the other. It's actually both. Depending on what you need, you will be using an MCP server or a Web MCP tool that you are exposing in your web app. Web MCP then has access to the front end, and that means the current page where you are, where the scroll is, and if you have a form, the data that the user or the agent has typed into that form. You have the whole user session, and every client API. Maybe you have a Web Serial device connected, and then you can use that. Okay? So this is important.

§6. Inspired by MCP, not the same protocol

Web MCP is inspired by MCP, but it does not come from it. So it's not actually the same project. It has a similar idea: to expose tools to the agent. The other day I saw someone on Twitter, I don't remember who, say that Web MCP is to MCP as JavaScript is to Java. Okay? Which is not a bad definition. So it's inspired by the same idea, exposing tools to the agent. But it's not using the same protocol. It's completely different. Okay? So it's a proposed standard API. It's not a W3C standard to date, but it's going in that direction. The author, in this case the website author, is then providing tools for the context of AI agents, or actually it can be AI chatbots as well.

§7. What is a tool? + origin trial status

So what's a tool? The name here is tool. Well, it can be any function. It can be any form that you have in your website. So, I mean, you are receiving support tickets and you have a form. Well, that can be a tool. Receiving form data for a new support ticket is actually a tool. It's going to be in origin trial tomorrow. So if you are watching this session live, tomorrow we will have Chrome 149, and it's going to be in an origin trial, meaning that you can start using it with real users out there. If not, you can enable a flag, or run Chrome with the flag, and you can also try this. But if you're watching this as a recorded session, then it's probably already there. Of course, agents will fall back to any other technique if the tools are not provided by the website. So it's important here to understand that Web MCP needs your work. It's not going to appear automagically.

So for coding agents in particular, because it can also be OpenClaw or it can be any agent, but for coding agents in particular: testing. Testing is to test the user behavior, to test as a user. Now you have another test to make. If you have Web MCP, you can also do unit testing over your tools. So you need more testing, actually. But agents will later use tools on web apps faster, and fill forms on the web with better results, without actually guessing what should be filled where. So that's kind of the idea.

And this is interesting. You can create debug tools that you can then use with your current testing solution. So you can actually gather data in the front end for your test suite, which you can then run and get from Web MCP. For example, you can expose a list of errors: you can expose a tool that collects the errors happening in that current session. And then with Web MCP, you get a list of those. Or you can run diagnostics, again, with the full browser context in the front end. You can describe the current view. You can get the DOM from the current view and verify it to know the state. To actually see this in action,

§8. Tool object anatomy (imperative API)

every tool is just an object. The object will have a name, a descriptive name, and a description. This is interesting: the description should be written for agents. So we need to understand that our customer is an AI agent, not the user. Right? And then we have an input schema, that's the JSON input schema, where we're going to specify the information, the parameters, the arguments, that that particular tool will need from the agent. And finally, we have an execute function, which is going to be executed by the Web MCP runner when the agent decides to actually use that particular tool. So the agent, as with MCP, will ask the browser, or actually your web app, for all the tools that you currently have available. And based on that, and based on the prompt or the goal of the agent, it will then execute one of those tools.
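Editor's sketch (not from the talk): the four parts of a tool object described above, plus a stand-in for the list-then-execute flow. The type names, the `get_team_matches` spelling, the fixture data, and the `listTools` / `callTool` helpers are all assumptions made for illustration; in a real page the browser, not this code, would hand the tools to the agent.

```typescript
// Illustrative types; the names are this sketch's own, not the spec's.
interface InputSchema {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
}

interface Tool {
  name: string;
  description: string; // written for the agent, not for the end user
  inputSchema: InputSchema;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

// Made-up fixtures standing in for the demo site's data.
const fixtures = [
  { date: "2026-01-01", home: "Team A", away: "Team B" },
  { date: "2026-01-02", home: "Team C", away: "Team A" },
  { date: "2026-01-03", home: "Team B", away: "Team C" },
];

const getTeamMatches: Tool = {
  name: "get_team_matches",
  description: "Return every fixture for one team. Pass the exact team name.",
  inputSchema: {
    type: "object",
    properties: { team: { type: "string", description: "Exact team name" } },
    required: ["team"],
  },
  execute: async (args) => {
    const team = String(args["team"]);
    return fixtures.filter((m) => m.home === team || m.away === team);
  },
};

// What the agent sees when it asks which tools are available: metadata only.
function listTools(tools: Tool[]): Array<Omit<Tool, "execute">> {
  return tools.map(({ name, description, inputSchema }) => ({
    name,
    description,
    inputSchema,
  }));
}

// What happens when the agent picks a tool: look it up by name, run execute.
async function callTool(
  tools: Tool[],
  name: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.execute(args);
}
```

With this made-up data, `callTool([getTeamMatches], "get_team_matches", { team: "Team A" })` resolves to the two fixtures that involve Team A.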

§9. Web MCP Inspector + World Cup demo

We have in Chrome a Web MCP inspector that looks like this. For example, this is a flight search. This is borrowed from the Chrome team. That example is kind of the standard example when we're talking about Web MCP. It's saying, okay, let's book a flight, and instead of typing things in the form, you just call a tool. So it's like calling a JavaScript function. But for our demo, the World Cup website, it's going to look like this. For example, if I zoom into the Web MCP tools available, I have get matches by date, get team matches, [unclear], list rules, list teams, search matches, and so on. That's the name. Then we have a description and the input schema. So for example, if I'm asking when the second match for England is, it's going to give me an answer based on executing the tool. If I ask, for example, whether it is possible to see a final between Argentina and England, in this case it makes several calls to several tools with different arguments. The agent is making these calls, and then it's giving me an answer. And yes, we can have a final between England and Argentina. These are the two options. So one needs to win the group [unclear]. I can even ask something like this: who will win the cup? Okay? And I'm not sure if I want to show you the next slide, because I want to go home safe tomorrow. But anyway, I can show you, this is the answer from the tool. Okay? Actually, we have that final. Okay? Of course, this is Gemini, by the way. So the model behind this in Chrome is Gemini.

§10. Imperative vs declarative API

So the API is actually available in two forms. You can use the imperative API, and that's async. Okay? You expose tools as JavaScript functions. And we have the declarative API. For example, this is the imperative API, where you register a tool and you pass the object that I mentioned before, including the execute function. That's all. Of course, you can register several tools. You don't need to register all the tools on the page at once. Based on the context of the app, on one particular state of the app, you can have a set of tools that may change over time. So you can register and unregister tools based on how the state of the app is evolving or changing. And execute returns a promise, so actually any async operation will work. If your front end also needs to call your back end, it's fine. Or maybe you need to talk to a Bluetooth device, and that's an async operation. Well, it's fine. You can still use Web MCP tools for that.

And the other one is the declarative API, where you define tool name and tool description on a form. Those are kind of the two mandatory attributes. You also have tool auto submit, a Boolean attribute that you can define to express that you accept that form being submitted by the agent. And then, if you want, on your form elements (input elements, text areas, selects, or web components that are form elements) you can define the tool param description. That will add more information for the agent on what needs to be filled in there, if you feel that maybe the label is not enough. But if you feel that the label is enough, for example for the first name, you don't need to add it. Okay? Then the agent will see that and say, okay, we have the create ticket tool available [unclear]
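Editor's sketch (not from the talk): the register and unregister pattern described for the imperative API, driven by app state. The talk does not spell out the browser object or exact method signatures, so `ToolRegistry`, `MemoryRegistry`, `syncTools`, and the `create_ticket` spelling are hypothetical stand-ins that let the idea run outside a Web MCP enabled browser; the real API surface may differ and may change during the origin trial.

```typescript
interface Tool {
  name: string;
  description: string;
  inputSchema: object;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

// Hypothetical stand-in for whatever object the browser exposes for
// registering and unregistering tools.
interface ToolRegistry {
  registerTool(tool: Tool): void;
  unregisterTool(name: string): void;
}

// In-memory registry so the sketch is runnable anywhere.
class MemoryRegistry implements ToolRegistry {
  readonly tools = new Map<string, Tool>();
  registerTool(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }
  unregisterTool(name: string): void {
    this.tools.delete(name);
  }
}

const createTicket: Tool = {
  name: "create_ticket",
  description: "Create a support ticket for the signed-in user.",
  inputSchema: {
    type: "object",
    properties: { subject: { type: "string", description: "Short summary" } },
    required: ["subject"],
  },
  // execute returns a promise, so a back-end call or device I/O would fit here.
  execute: async (args) => ({ created: true, subject: String(args["subject"]) }),
};

// Call this whenever the relevant app state changes, so the agent only
// ever sees tools that make sense right now.
function syncTools(registry: ToolRegistry, loggedIn: boolean): void {
  if (loggedIn) {
    registry.registerTool(createTicket);
  } else {
    registry.unregisterTool(createTicket.name);
  }
}
```

The design choice here is to keep one function that maps state to the tool set, rather than scattering register calls across the app, so that logging out reliably removes what logging in added.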

§11. Doom demo

So the biggest question is: does it run Doom? What do you think? That's the only way that we have these days to see if a technology is powerful enough. Right? So then I did this. I took a port of Doom, there are Doom engines out there, open source engines, and I added Web MCP to it. So for example, it's running Doom here. Right? So there are a couple of tools there. And I can actually say, for example: hey, move forward for ten seconds. And it's now running one tool. And it's not just running one tool at a time. You can also say rotate and move, because the agent can run more than one tool. And I can also say something like, look at this, so you see that this is really powerful: play Doom as a maniac for one minute. Okay? It takes some time [unclear]. Okay? So the agent is playing Doom. So I can use it while I'm sleeping. Okay. You got it. Right? So now, agents can play Doom.

§12. Using Web MCP today — Chrome DevTools MCP / Puppeteer

But then, if you want to use this today in a coding agent, I mean, in Claude Code, in Cursor, how can I use it? Well, we have it already today. You can use an MCP tool, MCP not Web MCP, from Chrome DevTools. So Chrome is now offering DevTools as an MCP tool, and through that MCP tool you can actually run Web MCP. So in this case, we have a coding agent. Any coding agent supports MCP, all of them. Then you connect it to DevTools MCP. DevTools MCP creates a Chrome instance. By the way, Web MCP doesn't work, at least for now, on headless browsers, because it's more suitable for a real context. Okay? But this may change. So we have a live Chrome, and over that live Chrome it runs a website. And then it will list all the tools available and it will execute the tools. So in the future, we may expect browser tools and coding agents to support Web MCP directly, so then we don't need that bridge. For example, Codex may include Web MCP support directly, or ChatGPT for MCP apps. Or when you have a browsing plugin for OpenClaw, it can support Web MCP directly. Okay? But this is still experimental.

§13. Tool design guidance

So designing tools is kind of like designing an API. It is an API: your Web MCP list of tools. So, some ideas. Use only one purpose per tool. Don't create tools that overlap, so then you are increasing the ability of the agent to run the right tool for each situation. Be state aware, and register tools only when useful. This is what I said before: you don't need to register all your tools at once. Because, I don't know, if it's a web app, maybe you are logged in, maybe you are logged out. The tools may not be the same in those situations. Use plain language. Describe what happens. Remember, this is going to be accessed by a model, by an agent that is using the model. So you are speaking to a model, basically. Be strict and return meaningful errors. Remember, your consumer is not the user. It's the AI agent. So you can actually respond with technical errors: okay, the parameters, the arguments are wrong, and what's going on. Then the agent can iterate and solve those issues without any human intervention. And try to return a small output, exactly what has been requested. By the way, in the Doom example, I created a couple of tools for moving, of course, and also for getting information about what you see on the screen. You can get the screenshot in Base64 as a PNG. And you can also get information about your current position in the map. And I'm trying to guess whether you are facing a wall or not. That's a Boolean. So those are the tools that I'm exposing. Of course, there are more tools or fewer tools that you can expose based on the agent. But the agent is providing
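Editor's sketch (not from the talk): one way to apply "be strict, return meaningful errors, return a small output" inside a tool. The `ToolResult` shape, the `get_matches_by_date` spelling, and the fixture data are assumptions made for illustration; the talk does not prescribe an error format.

```typescript
// Hypothetical result shape: either the requested data or an error message
// that tells the agent exactly what to fix.
type ToolResult =
  | { ok: true; data: Array<{ home: string; away: string }> }
  | { ok: false; error: string };

// Made-up fixtures standing in for real site data.
const fixtures = [
  { date: "2026-01-01", home: "Team A", away: "Team B", venue: "Stadium 1" },
  { date: "2026-01-02", home: "Team C", away: "Team A", venue: "Stadium 2" },
];

// Synchronous core so the validation logic is easy to unit test.
function matchesByDate(args: Record<string, unknown>): ToolResult {
  const date = args["date"];
  if (typeof date !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    return {
      ok: false,
      error: `Invalid "date": expected a string like 2026-01-01, received ${JSON.stringify(date)}.`,
    };
  }
  // Small output: only the fields the question needs, not the whole record.
  const data = fixtures
    .filter((m) => m.date === date)
    .map(({ home, away }) => ({ home, away }));
  return { ok: true, data };
}

const getMatchesByDate = {
  name: "get_matches_by_date",
  description: "List the fixtures played on one date (YYYY-MM-DD). Read-only.",
  inputSchema: {
    type: "object",
    properties: { date: { type: "string", description: "Date as YYYY-MM-DD" } },
    required: ["date"],
  },
  execute: async (args: Record<string, unknown>) => matchesByDate(args),
};
```

Returning the error as data rather than throwing is a judgment call in this sketch: it keeps the message in front of the agent so it can correct the argument and retry without human help.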

§14. Status, spec stability, adoption recipe

So this is really promising. But also remember that this is still experimental. There are a lot of discussions going on in terms of what will be added to or removed from the spec, with all the actors around AI agents. But right now, today, if you are watching this session live, it's a standard under discussion, and it's under a Chrome flag. So you need to enable the flag in Chrome. But again, tomorrow it's going to be in origin trial. With the origin of your website, you will be able to use it directly in Chrome. As it's an origin trial, it also means that the final API may change. And we also have an experimental flag, so if you want to use Puppeteer, it's also available there, for example as something that you can use from a coding agent. So it's the best moment to learn about it, and to start discussing whether you want to add support for it, to enable cheaper and better access to your web app from any agent, including coding agents. So just one suggestion, if you want to start using this in your agentic coding flow. Okay? You can pick just one high-value page state that you have in your web app, and expose a read-only diagnostic tool from that state, from your web app. Then you evaluate it with calls and arguments. And of course, you're going to use it from Claude Code, from Cursor, from OpenClaw. Then you automate that using DevTools MCP or Puppeteer. Those are the two options that we have today. Maybe in a couple of months it will be native, but not today. Then it will look like this. You will be able to ask why is this page [unclear], and the tool is going to be run by, like, MCP,
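Editor's sketch (not from the talk): what a first read-only diagnostic tool from the recipe above might look like, using the "list of errors in the current session" idea mentioned earlier in the session. `ErrorLog`, `makeListErrorsTool`, and the `list_errors` spelling are hypothetical; in a real page the log would presumably be fed from an error listener, which is left out here so the sketch runs anywhere.

```typescript
interface ErrorEntry {
  message: string;
  at: string; // ISO timestamp
}

// Collects errors for the current page session. Entries are recorded by hand
// in this sketch instead of being wired to a browser error event.
class ErrorLog {
  private entries: ErrorEntry[] = [];

  record(message: string, at: Date = new Date()): void {
    this.entries.push({ message, at: at.toISOString() });
  }

  // Newest first, capped so the tool output stays small. Returns copies so
  // callers cannot mutate the log: the tool stays read-only.
  recent(limit: number): ErrorEntry[] {
    const n = Math.max(1, Math.floor(limit));
    return this.entries.slice(-n).reverse().map((e) => ({ ...e }));
  }
}

function makeListErrorsTool(log: ErrorLog) {
  return {
    name: "list_errors",
    description:
      "Read-only. List the most recent errors captured in this page session, newest first.",
    inputSchema: {
      type: "object",
      properties: {
        limit: { type: "number", description: "Maximum entries to return, default 10" },
      },
    },
    execute: async (args: { limit?: number }) => log.recent(args.limit ?? 10),
  };
}
```

Because the tool only reads from the log, it is a low-risk first step: a coding agent can ask for the errors after exercising the page, and nothing in the app changes as a result.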

§15. Wrap

we can make web apps faster and cheaper for coding agents with Web MCP. Remember, it's an experimental API. It will expose tools to the agent from the front end, not the back end. That's the difference from MCP. It will use the full browser context, and you can use it right now with the origin trial in Chrome for end users, or using Chrome DevTools MCP or Puppeteer. Okay? So, having said that, and with two seconds left: thank you.

§16. Host close (Simon Maple)

Thank you so much. I love [unclear]. Or press the button that says I stepped outside, and then you'll get 10 points. Easy, isn't it? We'll be back here in half an hour for another session, while you're on coffee [unclear].
