My Coding Agent Invented an API That Doesn’t Exist (And Blamed CORS When It Failed)

6 Feb 2026 · 16 minute read

Baruch Sadogursky

Baruch Sadogursky is a Developer Advocate who helps developers move from vibecoding to spec-driven development, with deep experience from JFrog and now at Tessl.

Viktor Gamov and I decided to embarrass ourselves on camera again. (See our previous episode of public embarrassment here).

The premise: control a smart bulb with a webcam by detecting the dominant color and sending it to the bulb. Simple enough that we could build it live, but with an API obscure enough that the agent has to actually know things.
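For a sense of what "detecting the dominant color" can mean, here is a minimal sketch, assuming the webcam frame arrives as flat RGBA bytes (the layout a canvas `getImageData` call returns). The bucket-quantization approach and the names `Rgb` and `dominantColor` are illustrative assumptions, not the code we built on the stream.

```typescript
// Hypothetical helper for illustration; not the code built on the stream.
type Rgb = { r: number; g: number; b: number };
type Bucket = { count: number; r: number; g: number; b: number };

// Quantize every pixel into a coarse color bucket, then return the average
// color of the fullest bucket. `pixels` is flat RGBA data (4 bytes per pixel).
function dominantColor(pixels: Uint8ClampedArray, bucketSize: number = 32): Rgb {
  const perChannel = Math.ceil(256 / bucketSize);
  const buckets = new Map<number, Bucket>();

  for (let i = 0; i + 3 < pixels.length; i += 4) {
    const r = pixels[i];
    const g = pixels[i + 1];
    const b = pixels[i + 2]; // pixels[i + 3] is alpha, ignored here
    const key =
      (Math.floor(r / bucketSize) * perChannel + Math.floor(g / bucketSize)) * perChannel +
      Math.floor(b / bucketSize);
    const bucket = buckets.get(key) ?? { count: 0, r: 0, g: 0, b: 0 };
    bucket.count += 1;
    bucket.r += r;
    bucket.g += g;
    bucket.b += b;
    buckets.set(key, bucket);
  }

  // Pick the bucket that collected the most pixels.
  let best: Bucket | null = null;
  for (const bucket of Array.from(buckets.values())) {
    if (best === null || bucket.count > best.count) best = bucket;
  }
  if (best === null) return { r: 0, g: 0, b: 0 }; // no pixels: fall back to black

  return {
    r: Math.round(best.r / best.count),
    g: Math.round(best.g / best.count),
    b: Math.round(best.b / best.count),
  };
}
```

Averaging within the winning bucket, rather than averaging the whole frame, keeps a mostly red scene with a blue corner from turning the bulb a muddy purple.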

TLDR:

  • Spec-kit defines what you're building.
  • Tessl tiles provide the knowledge to build it correctly.
  • We combined them to control a smart bulb:
    • The agent found "critical gaps" in its own research when it compared its plan against actual Tessl documentation!
    • When we wiped the context mid-stream, it recovered from the spec files alone!

With vibecoding, we would have been debugging fictional endpoints for hours.

(The stream, because Viktor and I are old movie fans, is called “Agent Johnson, Special Agent Johnson, No Relation” (wink-wink, Die Hard groupies). It’s coming to a YouTube near you.)

The Hardware (Sort Of)

Viktor built a simulator for a Shelly smart bulb. The thing looks like an egg on screen, which is perfect for demos because I can share my screen on a stream instead of trying to point my camera at the actual bulb. We do use the real thing for conference on-stage demos!


Image with a bulb. It looks like an egg on screen.

The real Shelly Duo RGBW has a REST API with published documentation, but it’s not exactly trending on Hacker News. No viral GitHub repos, no Stack Overflow threads with 47 upvotes—the kind of API that exists in the long tail of the internet where LLMs fear to tread.

We’ve tried pure vibecoding with this API before. Asked Claude to “use the Shelly Lightbulb API” and watched it invent something like /api/v1/light/color. Looks plausible! Professional even! Completely fictional.

Then it hammered our bulb with requests that went nowhere. And when nothing worked? “Must be a CORS issue. Let me add some headers.” Classic agent cope—hours of context window burned, money spent, developer patience exhausted.

Enter Spec-kit: The Process Part

Spec-kit is GitHub’s answer to “my agent keeps building the wrong thing.” It’s a set of slash commands for Claude Code that implement spec-driven development.

The core insight: code generation is solved. The intent-to-code chasm is the hard part—the gap between what’s in your head and what ends up in the codebase. So spec-kit forces you through phases before you write a single line of code, bridging that chasm with documentation instead of hoping the agent guesses right.

Core Commands for the Spec Driven Development Workflow

Here’s the workflow:

Constitution → Governing principles that deliberately avoid tech stack or architecture decisions. What do we actually care about? TDD? Simplicity? Modern UX? Write it down, and everything else validates against this.

Specify → What are we building? User stories. Acceptance criteria. The agent asks clarifying questions instead of assuming.

Clarify → Gaps in the spec? The agent identifies them and interviews you. “The spec mentions devices, but Shelly has multiple bulb cores with different APIs. Which should we use?”

Plan → Now we talk tech. JavaScript or Java? (Viktor: “I don’t care.” Me: “As long as it’s Java.” Viktor: “Too late.” Me: “…”)

Tasks → Granular breakdown with checkboxes, dependencies, and phases.

Implement → Finally, code.

I love the concept of the constitution. Every time I type /speckit.constitution I imagine myself in a powdered wig at the Continental Congress in Philadelphia, debating founding principles with Benjamin Franklin. There’s a flag behind me. Bald eagles are screeching. Someone’s drafting amendments. The whole thing.

Viktor thinks it’s too enterprise. He’s wrong. The Founding Fathers were visionaries with excellent penmanship, and there’s nothing enterprisy about that.

Constitution in Practice

Initially, spec-kit generated five principles for us, including real-time responsiveness, local-first architecture, and technical standards. Way too much for a demo.

So we called a constitutional convention and simplified it to three principles:

## Core Principles

1. Test-Driven Development
2. Demo Simplicity
3. Modern UX/UI

Three principles, no tech stack, no architecture patterns, just the governing ideas that the rest of the process must respect. We the People of this Smart Bulb Demo, in order to form a more perfect application…

Later, the analyze phase caught this:

HIGH PRIORITY

TDD Violation in Phase 2: Tasks specify implementing code before writing tests. This violates constitution principle #1.

Image showing the various issues including high issue in TDD violations.

The agent reordered the tasks to put tests before implementation because the constitution required it, not because I remembered to check.
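The rule behind that finding is easy to picture as an ordering check. Here is a minimal sketch, assuming each task is tagged with a kind and the feature it belongs to. The `Task` shape, the task IDs, and `findTddViolations` are hypothetical, and this is not how spec-kit implements its analysis.

```typescript
// Hypothetical task shape; spec-kit's real tasks live in markdown, not in this form.
type Task = { id: string; kind: "test" | "impl"; feature: string };

// Return the IDs of implementation tasks scheduled before any test task
// for the same feature, i.e. the tasks that break a tests-first rule.
function findTddViolations(tasks: Task[]): string[] {
  const testedFeatures = new Set<string>();
  const violations: string[] = [];
  for (const task of tasks) {
    if (task.kind === "test") {
      testedFeatures.add(task.feature);
    } else if (!testedFeatures.has(task.feature)) {
      violations.push(task.id);
    }
  }
  return violations;
}

// Example plan: the bulb client is implemented before its tests exist.
const examplePlan: Task[] = [
  { id: "T001", kind: "impl", feature: "bulb-client" },
  { id: "T002", kind: "test", feature: "bulb-client" },
  { id: "T003", kind: "test", feature: "color-detection" },
  { id: "T004", kind: "impl", feature: "color-detection" },
];
const exampleViolations = findTddViolations(examplePlan);
```

The point is less the code than the fact that the principle is written down in a form that something other than my memory can check.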

This is what I mean when I say spec-driven development bridges the intent-to-code chasm. Your agent can forget what you’re building, but your constitution can’t. It’s ratified, archived, and guarded by a metaphorical bald eagle.

The Problem Spec-kit Doesn’t Solve

Spec-kit handles the process beautifully, but it assumes you already have the knowledge to feed it. Where does that knowledge come from?

We looked at Context7, a community repository of LLM documentation. It has a Shelly API entry with a red dot—didn’t work when we tried it.

Context7 - Search for Shelly API

And even when Context7 works, it has fundamental problems. There’s no notion of private spaces, so your organizational libraries and proprietary APIs have nowhere to go. It will give you the docs for the latest version it ingested, so it might not be the version you’re looking to work with. And the scope is guessed by the system, which usually means too much context gets loaded, and your context window fills up with irrelevant details. Why does this matter? Stay tuned.

Enter Tessl

This is where tiles come in.

A tile is a reusable piece of context (documentation, skills, and steering rules) that you can version, share, scope, and control.

Tessl Tiles Image

In my Tessl dashboard, I have a workspace called Shelly Cloud. Inside it, a tile called “Shelly Gen 1 API.” Purpose-built for exactly what we’re doing.

Inside the tile:

Shelly Gen1 Local AI HTTP API

Here’s the critical difference: I explicitly define the scope. Gen 1 API is in, Gen 2 is out. If you want Gen 2, you create a separate tile. Your agent loads exactly what it needs, nothing more.

Viktor asked me on stream: “What’s the downside of having too large a piece of context?”

Good question. We’ll get to that.

The Marriage: Spec-kit + Tessl

After running through the constitution, specification, clarification, and planning, the agent had done its “research.” It scraped Medium articles. Found outdated blog posts. Assembled a technical plan based on what it could find on the internet.

Time to sanity-check it against actual documentation.

tessl init

This adds a tessl.json to the project. Then I told Claude:

Using Tessl MCP, bring the relevant Tessl tiles for the selected tech stack and APIs.
Bash command for tessl search shelly

The agent searched the registry, found my Shelly Cloud tile, and installed it. Now tessl.json lists the dependency:

{
  "name": "shelly-bulb-sync",
  "mode": "vendored",
  "dependencies": {
    "shelly-cloud/shelly-gen1-api": {
      "version": "1.0.0"
    },
    "tessl/npm-vitest": {
      "version": "4.0.0"
    },
    "tessl/npm-vite": {
      "version": "7.3.0"
    }
  }
}
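
Because the manifest is plain JSON, it is easy to inspect what the agent installed. Here is a minimal sketch that lists the pinned tiles, assuming only the shape visible in the file above. The `TesslManifest` type and `listTiles` are hypothetical names, not part of the Tessl tooling.

```typescript
// Shape inferred from the tessl.json shown above; the real file may carry more fields.
type TesslManifest = {
  name: string;
  mode: string;
  dependencies?: { [tile: string]: { version: string } };
};

// Return one "tile@version" string per dependency, sorted by tile name.
function listTiles(manifestJson: string): string[] {
  const manifest = JSON.parse(manifestJson) as TesslManifest;
  const deps = manifest.dependencies ?? {};
  return Object.keys(deps)
    .sort()
    .map((tile) => `${tile}@${deps[tile].version}`);
}

// The manifest from this project, inlined so the sketch needs no file access.
const exampleManifest = JSON.stringify({
  name: "shelly-bulb-sync",
  mode: "vendored",
  dependencies: {
    "shelly-cloud/shelly-gen1-api": { version: "1.0.0" },
    "tessl/npm-vitest": { version: "4.0.0" },
    "tessl/npm-vite": { version: "7.3.0" },
  },
});
const installedTiles = listTiles(exampleManifest);
```

Pinned versions are what make the context reproducible: the next session, or the next developer, loads the same documentation.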

It also pulled in tiles for Vitest and Vite because I told it to grab everything relevant. Now re-run planning:

/speckit.plan

That was a mistake. When I run plan, spec-kit creates a fresh plan template and then instructs the agent to fill it in. But we already had a plan! It was overwritten with the template and recreated from scratch. Some tokens were wasted, and we learned something today. Generally, no harm done. But when it recreated the plan, things got interesting.

Image sharing the critical gaps found by agent

The agent discovered its earlier research was incomplete. It didn’t find the discovery protocol and was unaware of the recommended verification steps.

The plan was rewritten using up-to-date information from both the documentation and the steering parts of the tile.

The Context Window Reset Test

During our stream, I watched the “context left until autocompact” indicator drop to 9%. If we were vibecoding, that would be terrifying—once the window fills up, the agent starts forgetting things, and it’s game over.

With spec-kit? I don’t care. I preach running /clear deliberately between steps to free up the window. The knowledge isn’t in the agent’s memory anymore; it’s written down in the constitution and spec files. Like that movie Memento, except with markdown files instead of ink.

Time to prove that theory works. I wiped the context entirely. Not a gentle /compact, but a brutal, unforgiving /clear. Fresh start. The agent has zero memory of what we were doing.

/clear
/speckit.tasks
Spec-kit.tasks output on CLI

Because everything is documented (the constitution in specify/constitution.md, the spec in specs/, the plan in planning/, the tiles in tessl.json), the agent picked up exactly where we stopped. It read the artifacts, understood the project state, and continued generating tasks.

After vibecoding, this feels like magic. It’s the difference between your agent knowing what’s going on and your agent saying, “Let me first understand what kind of application this is,” while re-ingesting hundreds of files.

Every token spent re-learning your own code is a token you’re paying for twice.

The Purple Problem (And How We Fixed It)

Sidebar from the stream: Viktor has opinions about AI-generated UI.

“When you see a website generated through agents, it will look the same. Black and purple. All of them.”

He’s right. The models were pre-trained on whatever was trendy on the internet. For a while, that was dark mode with purple accents. Now every AI-generated site looks like it was designed by the same person in 2023.

We installed a UI/UX plugin (Claude’s proprietary format for a collection of skills) called “UI UX Pro Max” (incredible name, I know). It has recommendations for different app types—SaaS dashboards, chatbots, landing pages—with specific guidance on color schemes, typography, and interaction patterns.

Example of Webcam Preview UI

The result was a clean UI that doesn’t scream “I was generated by AI”—no purple, actual design choices that looked intentional. (Well, you see purple, but it’s because my backlight is purple. I kinda confused you with this one. But the UI doesn’t have any purple!)

Viktor wants to rebuild his entire presentation website using this plugin. That’s the power of reusable context: you don’t just use it once.

What We Actually Built

By the end of the session:

  • Constitution enforcing TDD, simplicity, and modern UX
  • Specification with user stories, acceptance criteria, and edge cases
  • Technical plan validated against actual API documentation from Tessl tiles
  • 62 tasks broken into phases, properly ordered (tests before implementation, because, you know, you don’t argue with the constitution)
  • A working web app that correctly controls the Shelly bulb simulator

Final working app showing the webcam feed and connected bulb displaying a detected color

The UI is clean, the API calls work, and the tests are in place.

Did we hit bugs? Sure. The detected color display was black initially (turns out the preview component needed fixing). The IP input field had overly strict validation rules (it didn’t expect the simulator to use host:port instead of the actual IP).
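
The validation bug is a good example of a rule that was stricter than reality. Here is a minimal sketch of a more forgiving check that accepts a bare IPv4 address, a hostname, or either one with a port. The function name, the regular expression, and the default port of 80 are my assumptions, not the fix the agent produced.

```typescript
// Hypothetical helper; the real app's validation may differ.
type BulbAddress = { host: string; port: number };

// Parse "host" or "host:port". Returns null when the input is not usable.
function parseBulbAddress(input: string): BulbAddress | null {
  const match = /^([A-Za-z0-9.-]+)(?::(\d{1,5}))?$/.exec(input.trim());
  if (match === null) return null;

  const host = match[1];
  // Assumption: plain HTTP on port 80 when no port is given.
  const port = match[2] === undefined ? 80 : Number(match[2]);
  if (port < 1 || port > 65535) return null;

  // A host made only of digits and dots must be a well-formed IPv4 address.
  if (/^[\d.]+$/.test(host)) {
    const octets = host.split(".");
    const valid =
      octets.length === 4 && octets.every((octet) => octet !== "" && Number(octet) <= 255);
    if (!valid) return null;
  }
  return { host, port };
}
```

With this shape, "192.168.1.50" (a made-up address) and "localhost:8080" both pass, while "300.1.1.1" and "localhost:99999" are rejected.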

But the process was different. When something broke, we knew where to look: the specs defined intended behavior, the tiles contained actual API contracts, and the constitution kept us honest about principles.

The Head First Java Moment

Viktor pushed back during the stream. “We spent an hour and ten minutes and still haven’t implemented anything.”

Fair point. Let me offer a parable.

In my all-time favorite technical book, called *Head First Java,* there is a chapter about object-oriented programming. It has a colorful competition between an OO developer and a procedural developer, racing to deliver a project. The ultimate prize: an Aeron chair. (The stakes were high in 2003.)

Who took the lead early? The procedural developer. Easy to bang out features when you don’t have to think about structure.

But as the project continued, with more requirements (sometimes contradicting previous ones) and refactoring needed, who won?

(The boss’ secretary got the chair, actually. But that’s not the point.)

The analogy to our world: vibecoding is the procedural developer taking an early lead, while spec-driven development is the slower start that actually bridges the intent-to-code chasm.

Sure, we could have had a purple, hallucination-riddled, test-free app in thirty minutes, but it wouldn’t survive a context window reset, and you’d be debugging fictional API calls for the rest of the afternoon.

Which one do you want running in production?

Try It Yourself

The spec-kit integration with Tessl follows this order:

# First, figure out what you're building
/speckit.constitution
/speckit.specify
/speckit.clarify
/speckit.plan

# Now you know the tech stack. Ask the agent to find relevant tiles
"Using Tessl MCP, search for and install tiles for the tech stack in our plan"

# Revise the plan with actual documentation (don't run /speckit.plan again!)
"Review and correct the plan based on the documentation from the Tessl tiles"

The key insight: you can’t install tiles before you know what you’re building. Constitution, specification, and clarification come first. The initial plan determines your tech stack. Then you bring in the tiles and ask the agent to revise the plan.

It won’t prevent all mistakes, but it will make your agent’s failures more debuggable and its successes more reproducible.

What’s Next

Next up: I’ve implemented this entire spec-kit workflow as AI agent skills and packed them into a Tessl tile for distribution. No more CLI to install, and the adherence to process, the constitution checking, and the artifact validation all get significantly more robust when you move from slash commands to executable skills. Skills are awesome!

That’s the next episode. Viktor will complain about enterprise complexity. I will make more Continental Congress jokes. We will embarrass ourselves in public while wearing metaphorical powdered wigs. Stay tuned.

*Baruch Sadogursky is a Developer Advocate at Tessl, where he helps developers stop vibecoding and start spec-driven development. Previously, he spent years at JFrog convincing people that artifact repositories matter. He was right about that, too.*