AI agents might be improving at many things, but there is one area of the software stack where they still struggle: deciding which tools to use, and when.
As developers wire large language models into real systems — from coding agents to internal developer platforms — those agents are increasingly expected to choose between dozens, or even hundreds, of internal tools and APIs. And this has exposed a real scaling problem. Passing every tool definition into a model upfront eats into context limits, degrades accuracy, and makes agents unreliable once systems grow beyond demos.
Addressing that class of problem has become a key focus for Anthropic, as it looks to push agent capabilities into production environments where tool sprawl, cost, and reliability lock horns.
Why Tool Search matters inside Claude Code
Alongside the launch of its powerful new Claude Opus 4.5 model back in November, Anthropic also unveiled a handful of features on the Claude Developer Platform, bundled under what it calls advanced tool use. This included a new Tool Search Tool designed to let Claude discover and load tools on demand, rather than requiring entire tool catalogs to be passed into the model’s context window upfront.
Fast-forward to January, and Anthropic quietly announced that it was applying that Tool Search capability directly to Model Context Protocol (MCP) inside Claude Code, bringing dynamic tool discovery into MCP-based toolchains where large, real-world tool catalogs have begun to strain existing approaches to agent design.
MCP is an open protocol, developed by Anthropic, that lets Claude Code and other agent frameworks use tools exposed by external servers. Thariq Shihipar, a member of Anthropic's technical staff working on Claude Code, said the new functionality emerged as MCP servers and agent toolchains began to grow far beyond what could comfortably fit into a model’s context window.
“As MCP has grown to become a more popular protocol and agents have become more capable, we've found that MCP servers may have up to 50+ tools and take up a large amount of context,” Shihipar wrote. “Tool Search allows Claude Code to dynamically load tools into context when MCP tools would otherwise take up a lot of context.”
According to Shihipar, the new functionality enables Claude Code to automatically switch from preloading MCP tools to search-based loading once tool descriptions consume more than 10% of the available context window. He described the update as a response to one of the most requested features on Claude Code’s GitHub repository — “lazy loading for MCP servers” — citing user setups with seven or more MCP servers consuming around 67,000 tokens before an agent even gets out of bed.
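The switching rule Shihipar describes can be sketched in a few lines of Python. This is an illustration only, not Claude Code's implementation: the function names, the four-characters-per-token estimate, and the 200,000-token context window are all assumptions made for the example. Only the 10% threshold and the 67,000-token figure come from the reporting above.

```python
from dataclasses import dataclass

# Assumption: a crude heuristic of ~4 characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4


@dataclass(frozen=True)
class ToolDef:
    """Hypothetical stand-in for an MCP tool definition."""
    name: str
    description: str


def estimate_tokens(text: str) -> int:
    """Rough token estimate, rounding up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def tool_tokens(tools: list[ToolDef]) -> int:
    """Estimated tokens consumed by a list of tool definitions."""
    return sum(estimate_tokens(f"{t.name}: {t.description}") for t in tools)


def choose_loading_mode(tool_token_count: int, context_window: int,
                        threshold: float = 0.10) -> str:
    """Return "search" when tool definitions exceed the threshold share
    of the context window, otherwise "preload"."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    if tool_token_count > threshold * context_window:
        return "search"
    return "preload"


# The 67,000-token setup cited above, against an assumed 200,000-token window.
mode_for_large_setup = choose_loading_mode(67_000, 200_000)
```

Under those assumptions, the 67,000-token setup sits well above the 20,000-token cutoff and would be routed to search-based loading, while a handful of small tools would still be preloaded.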
Treating tools as something agents can discover on demand brings reusability and discoverability into focus, since tools no longer need to be embedded directly into prompts or loaded upfront as part of a fixed setup.
In that context, shared registries can be a practical way for teams to surface and reuse existing helpers rather than rebuilding them repeatedly. Tessl, for example, maintains a public registry that catalogues agent tools and helpers so they can be discovered and reused across projects.
Community reacts: Why tool search matters for agent systems
Much of the community discussion around Tool Search emerged following its initial launch on Anthropic’s developer platform last November, but it helps explain why the capability is now being applied more directly to MCP-based workflows in Claude Code. Tool Search removes a major scaling constraint for agent systems, but it doesn’t eliminate the need for careful tool design — poorly specified tools remain difficult for models to reason about, regardless of how they are discovered.
Alex Salazar, co-founder and CEO at AI tool-calling platform Arcade.dev, said the feature shifts the bottleneck rather than removing it, making tool quality and discoverability more important as agent systems scale.
“You can give Claude access to a thousand tools, but if they're poorly built, it doesn't matter,” Salazar said. “Bad tool definitions lead to bad tool selection.”
Salazar pointed to recent conversations with large enterprises as evidence of how quickly the problem emerges in real-world scenarios. Once agent systems are connected to more than a couple of dozen tools, he said, token usage climbs and tool selection accuracy degrades, pushing teams toward ad-hoc workarounds. Some have attempted to compensate by layering in retrieval pipelines or custom routing logic, but those approaches often introduce new performance constraints rather than addressing the underlying issue.
The importance of Tool Search, according to Salazar, isn’t simply that it allows agents to access more tools, but that it changes how those tools are presented to the model in the first place. Rather than forcing agents to reason with an ever-growing set of tool definitions in their working context, the approach shifts tool access into a retrieval problem — reducing both memory pressure and the risk of degraded decision-making as systems scale.
“Before, these models had to keep every possible tool in its working memory at all times,” Salazar wrote. “Now it can offload that and search through it when needed. It's like the difference between keeping everything in your head and referring to a dictionary. Just like your brain, giving the Claude models the ability to keep tools in a ‘dictionary’ means reducing the taxing load of holding onto all that memory while also improving accuracy.”
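To make the "dictionary" idea concrete, here is a minimal sketch of tool access as retrieval: the catalog stays outside the model's context, and only the best matches for a query are returned. The catalog entries, the function names, and the simple word-overlap scoring are all hypothetical, and this is not how Anthropic's Tool Search ranks tools.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    """Hypothetical stand-in for a tool definition."""
    name: str
    description: str


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(query: str, catalog: list[ToolDef], top_k: int = 3) -> list[ToolDef]:
    """Return up to top_k tools ranked by word overlap with the query.
    Tools with no overlapping words are never returned."""
    query_words = tokenize(query)
    scored = []
    for tool in catalog:
        overlap = len(query_words & tokenize(f"{tool.name} {tool.description}"))
        if overlap > 0:
            scored.append((overlap, tool))
    # Stable sort: ties keep their catalog order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]


# Illustrative catalog; none of these are real MCP tools.
CATALOG = [
    ToolDef("create_issue", "Create a new issue in the bug tracker"),
    ToolDef("send_email", "Send an email message to a recipient"),
    ToolDef("query_database", "Run a read-only SQL query against the analytics database"),
]

best_match = search_tools("file a bug tracker issue", CATALOG, top_k=1)
```

Even in this toy version, Salazar's caveat holds: a tool whose name and description share no words with the request never surfaces, so the quality of the definitions decides what the search can find.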
Those same dynamics are now playing out more concretely inside Claude Code, as Tool Search is applied to MCP servers where large, fast-growing toolchains have begun to highlight the limits of preloading tools into context.
Not everyone is convinced this approach resolves the underlying issue, though. In a Reddit thread discussing the feature’s rollout in Claude Code, one commenter argued that the approach risks “moving the problem again,” suggesting Anthropic is compensating for poor tool hygiene rather than solving it outright.

Another pushed back, countering that the long-term goal should be to remove the need for manual tool management altogether, describing Tool Search as “a step in that direction.” The exchange highlights a broader tension in agent development: whether the future lies in ever more intelligent infrastructure to manage growing tool complexity, or in better abstractions that make such management unnecessary in the first place.
Tool Search, ultimately, reflects a shift in how agent systems are being designed, moving tool access out of the context window and into retrieval. Throwing MCP Tool Search into the Claude Code mix shows how that shift becomes necessary once agents are wired into large, real-world toolchains. As tool catalogs continue to grow, treating tool access as a retrieval problem rather than a memory one is likely to become an important foundation for building agents that operate reliably across complex software environments.