Making Claude good at Go using Context Engineering with Tessl

19 Jan 2026, 13 minute read

Rotem Tamir

Rotem Tamir is a Go developer, ex-CTO and co-founder of atlasgo.io, and a consultant at honeybadge labs.

Table of Contents

Context Engineering for Gophers with Tessl

Evaluating Tessl’s impact on Go development

Conclusion


TL;DR

The problem: LLMs aren’t great at Go. They hallucinate APIs and miss idioms, especially for newer library features that are not in their training data.
The “context engineering” fix: Provide agents with accurate, up-to-date docs using Tessl and its MCP.
The proof: On a real bug, Tessl + Claude Opus 4.5 hit a 100% success rate, ran 1.6x faster, and cost 3x less ($0.10 vs $0.30 per trial).

2025 was the year coding agents got real — but they’re still erratic. Superhuman one moment, burning through my token allowance on nonsense the next. I’ve learned to compensate by manually feeding them context: logs, docs, code references, hoping to keep them on track. This practice has a name: context engineering. As Tobi Lutke (CEO of Shopify) wrote:

“it is the art of providing all the context for the task to be plausibly solvable by the LLM.”

When the current generation of agents (I am writing these lines weeks after Claude Opus 4.5 was released) fail, it’s often a context problem, not a fundamental issue.

As a long-time Gopher, I’m feeling this hard. Models are still kind of crappy at Go—they stumble over idioms that any experienced developer would find obvious, and they don’t know important Go package APIs very well.

When the team at Tessl invited me to join as a consultant and put my love of Go to work making agents better at it, I was more than happy to accept.

Context Engineering for Gophers with Tessl

Tessl is all about enabling coding agents to work on real-life, large-scale, modern codebases via context engineering.

Here’s the thing about modern software: according to the Synopsys OSSRA report, 96% of applications include open source components, with the average codebase consisting of 76% open source code. To be good at building software, agents need up-to-date, concise context about your dependencies.

For this, Tessl built the spec registry: a way to “install” bits of context about your deps to supercharge your agent. Unlike other solutions, Tessl’s approach is a rigorous, scientific process. Every spec goes through extensive evals and refinement to ensure the context you get is high quality and effective. It takes Tobi’s idea of context engineering from an art to a streamlined, engineered process.

I joined Tessl to build good support for Go. While context engineering is a language-agnostic concept, making it work is all about specifics: having the subtle yet relevant nuggets of data while reducing noise as much as possible.

Today, we’re excited to announce our modest beginning: Go’s top 50 most important libraries are now available on Tessl’s registry. It’s free to sign up and use.

Getting Started

Let’s demonstrate how Tessl works (and why it’s beneficial) with a simple example: a basic calculator CLI!

To make this guide short and sweet, I assume you have some things pre-installed:

Recent version of Go (I’m using 1.25.3)
Claude Code installed and logged-in
brew (to install tessl)

1. Initialize our Go project

First, initialize a Go module:

$ go mod init calculator-cli
go: creating new go.mod: module calculator-cli

2. Install the Tessl CLI

brew install tesslio/tap/tessl

3. Install `kong`, a CLI library

There are a bunch of CLI-tool frameworks available in the Go ecosystem, but my favorite has always been alecthomas/kong. Let’s install it:

$ go get github.com/alecthomas/kong
go: added github.com/alecthomas/kong v1.13.0

3. Initialize Tessl in your project

From your project’s root directory:

tessl init --agent claude-code

This will authenticate you, and configure your coding agent for MCP support.

You will find some brand new files in your project directory:

├── .mcp.json
├── .tessl
│   └── .gitignore
├── go.mod
├── go.sum
└── tessl.json

2 directories, 5 files

Tessl uses MCP to expose spec-registry documents (called “tiles”), which is why you will find an .mcp.json file for Claude Code. Additionally, tessl.json keeps track of the installed tiles, and .tessl will contain the actual tiles. Think of these as the package.json and node_modules/ of the Node.js ecosystem.

4. Install dependencies and tiles

Now, let’s search the registry to see if Tessl has a tile for our library, and install it if so:

tessl search kong

? Select a tile to install
 tessl/golang-github-com-alecthomas--kong
❯   1.13.1

Kong is a command-line parser for Go that enables building complex CLI applications through declarative struct-based grammar definitions
↑↓ navigate ─ ⏎ select ─ Esc cancel

Great! Tessl has a tile for the Kong library we use. Let’s hit ENTER to install it:

✔ Installed tessl/golang-github-com-alecthomas--kong@1.13.1

Notice a few more files were added to our repo:

$ tree -a
.
├── .mcp.json
├── .tessl
│   ├── .gitignore
│   ├── RULES.md
│   └── tiles
│       └── tessl
│           ├── cli-setup
│           │   ├── steering
│           │   │   └── query_library_docs.md
│           │   └── tile.json
│           └── golang-github-com-alecthomas--kong
│               ├── docs
│               │   ├── core-parsing.md
│               │   ├── errors.md
# ... truncated for brevity
│               │   ├── tags.md
│               │   └── utilities.md
│               └── tile.json
├── AGENTS.md
├── CLAUDE.md
├── go.mod
├── go.sum
└── tessl.json

13 directories, 28 files

Let’s break down what got added:

Tiles with documentation (.tessl/tiles/) - The actual context for our dependencies. Notice golang-github-com-alecthomas--kong/docs/ contains pre-extracted, agent-optimized documentation covering Kong’s parsing, hooks, mappers, tags, and more. This is the good stuff that’ll make our agent actually understand how to use Kong properly.
Tiles with rules (.tessl/tiles/tessl/cli-setup/steering/) - These teach the agent how to use the knowledge tiles effectively. In this case, query_library_docs.md instructs the agent to look up library documentation before writing code.
Agent instruction files (AGENTS.md, CLAUDE.md) - Project-level guidance that Claude and other agents read automatically. This is where the Tessl system hooks into your coding assistant.

Think of it this way: the documentation tiles are the what (the library knowledge itself), and the rule tiles are the how (when and how to use that documentation). Together, they transform a generic coding agent into one that actually knows your stack.

5. Let’s code!

With all of that configuration behind us, let’s see what Claude can make with our enhanced setup:

claude --allowedTools 'mcp,Read,Edit,Write,Bash,Glob,Grep' \
-p "Build a Go CLI using alecthomas/kong with an 'add' subcommand that takes
 two int flags (--a, --b) and prints their sum. Use struct-per-subcommand
 pattern with Run() methods, no switch/case."

Claude tosses and turns for a while, but after a minute or two, it comes bearing good news:

Perfect! I've successfully built a Go CLI using Kong with an 'add' subcommand. Here's what was created:

**main.go:1-29**

The implementation uses the struct-per-subcommand pattern as requested:

- `CLI` struct contains the `Add` subcommand
- `AddCmd` struct defines two required integer flags (`--a` and `--b`)
- `Run()` method on `AddCmd` implements the logic (calculates and prints the sum)
- No switch/case statements - Kong automatically routes to the `Run()` method

**Testing results:**
```bash
$ go run main.go add --a=5 --b=10
15
```

LLMs are known to be overconfident, so let’s try this ourselves:

$ go run main.go add --a=100 --b=10
110

$ go run main.go add --a=100 --b=100
200

Let’s look at the produced code:

package main

import (
    "fmt"

    "github.com/alecthomas/kong"
)

type CLI struct {
    Add AddCmd `cmd:"" help:"Add two integers and print their sum"`
}

type AddCmd struct {
    A int `help:"First integer" required:""`
    B int `help:"Second integer" required:""`
}

func (a *AddCmd) Run() error {
    sum := a.A + a.B
    fmt.Println(sum)
    return nil
}

func main() {
    var cli CLI
    ctx := kong.Parse(&cli)
    err := ctx.Run()
    ctx.FatalIfErrorf(err)
}

Great! We’ve successfully burned through countless tokens and electricity to produce a trivial program that works!
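To make the "no switch/case" claim a bit more concrete, here is a minimal standard-library sketch of the struct-per-subcommand idea. It is not how Kong works internally; the `command` interface, the `dispatch` function, and the lookup table are all names I made up for illustration. The point is only that each subcommand is a struct with its own run method, and routing becomes a table lookup instead of a branch per command.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"strconv"
)

// command is a hypothetical contract that every subcommand struct satisfies.
type command interface {
	// bind registers the subcommand's flags on fs.
	bind(fs *flag.FlagSet)
	// run executes the subcommand and returns the text to print.
	run() (string, error)
}

// addCmd mirrors the AddCmd struct above: two integer flags and a run method.
type addCmd struct {
	a, b int
}

func (c *addCmd) bind(fs *flag.FlagSet) {
	fs.IntVar(&c.a, "a", 0, "first integer")
	fs.IntVar(&c.b, "b", 0, "second integer")
}

func (c *addCmd) run() (string, error) {
	return strconv.Itoa(c.a + c.b), nil
}

// dispatch finds the subcommand in a table and calls its run method,
// so adding a subcommand means adding a struct and one table entry.
func dispatch(args []string) (string, error) {
	commands := map[string]command{"add": &addCmd{}}
	if len(args) == 0 {
		return "", fmt.Errorf("expected a subcommand")
	}
	cmd, ok := commands[args[0]]
	if !ok {
		return "", fmt.Errorf("unknown subcommand %q", args[0])
	}
	fs := flag.NewFlagSet(args[0], flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the way in this sketch
	cmd.bind(fs)
	if err := fs.Parse(args[1:]); err != nil {
		return "", err
	}
	return cmd.run()
}

func main() {
	out, err := dispatch(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Kong gets to the same place declaratively, from struct tags, which is why the generated program above needs so little code.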

Evaluating Tessl’s impact on Go development

If you are a skeptical engineer like me, you are probably asking yourself, "Come on, Claude Opus 4.5 would have easily solved the problem without any external help, how do we know if Tessl has a positive impact on performance?"

Luckily, Tessl has published an extensive report on the impact of tiles on coding agent performance. The headline: tiles deliver ~35% relative improvement in accuracy over baseline, with even stronger gains for recently released libraries (50%) and new API features (up to 90%). Agents complete tasks faster while using fewer turns, and the cost-efficiency is roughly 2x better than giving agents raw source code access. It's a fascinating piece of research - I highly recommend reading it.

To illustrate the process with a Go-specific example, let's run our own experiment.

Fixing a bug

Tessl’s research shows that enabling coding agents using Tiles improves performance in general, but especially when working on newer code. Intuitively this makes sense: recent additions to a library’s API might be scarce or absent from the model’s training data, meaning an agent would have to either look for documentation or analyze the library’s source code to make progress.

For my experiment, I chose a bug (“Unable to parse negative positional argument”) that was recently fixed in alecthomas/kong. As described in the issue, the library would error whenever a negative number was supplied as a separate argument. In our case, with a negative flag value, it would manifest as:

$ go run main.go add --a 100 --b -100

main: error: --b: expected int value but got "-100" (short flag); perhaps try --b="-100"?
exit status 80

Something funky is going on there with parsing the flags. Kong interprets -100 as a short flag rather than a value. The workaround is to use = syntax:

go run main.go add --a=100 --b=-100
0

It works, but it’s inconsistent - users expect --flag value syntax to just work. To support this use case, on May 15th, 2025, the authors added the WithHyphenPrefixedParameters option.
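To see why a leading hyphen is ambiguous, here is a toy parser, written under the assumption that a strict parser classifies any token starting with "-" as a flag. It is a sketch of the general problem, not Kong's implementation: `parseInts` and its `allowHyphenValues` parameter are hypothetical names, with the boolean standing in for an opt-in like WithHyphenPrefixedParameters. I am deliberately not calling the real Kong option here, since its exact signature is not shown in this article.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseInts is a toy parser for "--name value" and "--name=value" integer
// flags. allowHyphenValues is a hypothetical switch: when true, a token that
// starts with "-" may be consumed as a value instead of being read as a flag.
func parseInts(args []string, allowHyphenValues bool) (map[string]int, error) {
	out := map[string]int{}
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "--") {
			return nil, fmt.Errorf("unexpected argument %q", arg)
		}
		name, raw, hasValue := strings.Cut(strings.TrimPrefix(arg, "--"), "=")
		if !hasValue {
			// "--name value" form: the value has to come from the next token.
			if i+1 >= len(args) {
				return nil, fmt.Errorf("--%s: missing value", name)
			}
			next := args[i+1]
			// In strict mode "-100" looks like a flag, so it is rejected.
			if strings.HasPrefix(next, "-") && !allowHyphenValues {
				return nil, fmt.Errorf("--%s: %q looks like a flag; try --%s=%s", name, next, name, next)
			}
			raw = next
			i++
		}
		n, err := strconv.Atoi(raw)
		if err != nil {
			return nil, fmt.Errorf("--%s: expected int value but got %q", name, raw)
		}
		out[name] = n
	}
	return out, nil
}

func main() {
	args := []string{"--a", "100", "--b", "-100"}

	_, err := parseInts(args, false)
	fmt.Println("strict: ", err)

	vals, err := parseInts(args, true)
	fmt.Println("lenient:", vals["a"]+vals["b"], err)
}
```

In strict mode the `--b=-100` form still parses, because the `=` removes the ambiguity. That is the same workaround shown above.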

This is a great bug for evaluating the contribution of Tessl to Claude’s performance. Because the change is fairly recent and the library is not especially high profile, there’s a good chance the model hasn’t seen it in its training data. We can gain more confidence by asking:

Benchmarking Tessl using Harbor

Harbor is a framework for evaluating AI agents in containerized environments. Created by the team behind Terminal-Bench, it lets you run reproducible benchmarks against coding agents in isolated Docker containers.

A Harbor task bundles an instruction, container environment, and test script:

kong-negative-args/
├── task.toml           # Timeouts, resource limits
├── instruction.md      # What the agent sees
├── environment/        # Dockerfile + source files
├── solution/           # Oracle solution (solve.sh)
└── tests/              # test.sh → writes reward to /logs/verifier/reward.txt

Our instruction tells the agent to fix TestAddNegative by modifying main.go, using Kong’s API rather than workarounds like --b=-2.

The A/B Setup

To evaluate Tessl’s impact on Claude’s performance I’ve prepared a small Harbor benchmark with two nearly identical tasks (find the full code on GitHub):

Task	Tessl	Description
kong-negative-args	❌	Baseline - no knowledge tiles
kong-negative-args-tessl	✅	Has Tessl installed with Kong tiles

Both tasks are set up like this:

The initial state of the code (what we built together above)
A failing test that reproduces the issue above
An instruction prompt asking the agent to fix the failing test
A verifier script that provides the reward score (1.0 for success, 0.0 for failure)
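Because every trial yields a reward of either 1.0 or 0.0, the success rate of a run is simply the mean reward. Here is a small sketch of that aggregation, assuming the per-trial rewards have already been collected into a slice. The `successRate` helper and the sample values are mine, not part of Harbor.

```go
package main

import "fmt"

// successRate averages per-trial rewards, where 1.0 means the verifier
// passed and 0.0 means it failed. An empty run is reported as 0.
func successRate(rewards []float64) float64 {
	if len(rewards) == 0 {
		return 0
	}
	var sum float64
	for _, r := range rewards {
		sum += r
	}
	return sum / float64(len(rewards))
}

func main() {
	// Made-up sample: four trials, one of which failed.
	rewards := []float64{1.0, 1.0, 0.0, 1.0}
	fmt.Printf("success rate: %.0f%%\n", successRate(rewards)*100)
}
```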

The only difference: the Tessl variant includes .tessl/tiles/ with Kong’s documentation and steering rules that guide the agent to query library docs before writing code.

Running the Benchmark

We run each task 25 times (-k 25). The baseline uses Harbor’s built-in Claude Code agent, while the Tessl variant uses our custom adapter that configures MCP and injects the token.

# Baseline (no Tessl)
uv run harbor run \
  --path tasks/kong-negative-args \
  --agent claude-code \
  -k 25

# With Tessl
uv run harbor run \
  --path tasks/kong-negative-args-tessl \
  --agent-import-path agents.claude_code_tessl:ClaudeCodeTessl \
  -k 25

Results (n=25 each)

Duration distribution

Key findings

We ran 25 trials each comparing Claude Opus 4.5 with and without Tessl MCP on a real Go CLI bug requiring discovery of a newer Kong parser API.

Higher success rate

Tessl MCP achieved 100% success vs 92% baseline
2 baseline trials timed out entirely while hunting for the correct API

Faster completion

1.6x speed improvement: 69s average vs 112s baseline
Tessl runs showed tight consistency (62-81s) vs high baseline variance (57-178s) that depended on search luck

Lower cost

3x cost reduction: $0.10 vs $0.30 per trial
Token usage tells the story: baseline averaged 381K tokens vs Tessl's 129K
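If you want to sanity-check the headline multipliers, they follow directly from the averages quoted above. The sketch below just redoes that arithmetic; the `ratio` helper is my own, and the 23-of-25 figure comes from the two baseline timeouts.

```go
package main

import "fmt"

// ratio reports how many times larger baseline is than treated.
func ratio(baseline, treated float64) float64 {
	return baseline / treated
}

func main() {
	fmt.Printf("speed:  %.1fx\n", ratio(112, 69))    // average seconds per trial
	fmt.Printf("cost:   %.1fx\n", ratio(0.30, 0.10)) // dollars per trial
	fmt.Printf("tokens: %.1fx\n", ratio(381, 129))   // thousands of tokens per trial

	// 2 of the 25 baseline trials timed out, so 23 succeeded.
	fmt.Printf("baseline success: %.0f%%\n", 100*ratio(23, 25))
}
```

This prints 1.6x, 3.0x, 3.0x and 92%. The token ratio is about the same as the cost ratio, as you would expect when cost is driven mostly by tokens.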

Why it works

Tessl's query_library_docs tool provided the answer directly, yielding consistent 15-19 step completions
Fewer exploration steps mean smaller contexts and lower costs

Bottom line: Reliable API context eliminates the variance and failures inherent in web search, making agents both faster and cheaper.

Full benchmark code and raw data: rotemtam-tessl/kong-tessl-eval

Conclusion

Context engineering isn't just about throwing more documentation at your agent—it's about providing the right context at the right time. In this benchmark, rather than letting Claude search the entire web for Kong documentation, Tessl's query_library_docs tool delivered only the relevant slice of the Kong parser API—nothing more, nothing less. This precision is what transforms context engineering from an art into an engineering discipline.

The cost advantage is real

Beyond correctness and speed, the token savings translate directly to your bottom line. At 3x lower cost per task ($0.10 vs $0.30), the economics compound quickly across a development team. Fewer exploration steps mean smaller contexts, which means less money spent watching your agent wander through outdated Stack Overflow answers.

For Go developers, having accurate, up-to-date context about your dependencies means fewer hallucinated APIs, fewer outdated patterns, fewer tokens burned, and less time cringing as you watch your coding agent drive straight into a wall.

The top 50 Go libraries are now available on Tessl’s registry. Give it a try—your future self (and your token budget) will thank you.