CtrlK
BlogDocsLog inGet started
Tessl Logo

honeybadge/harbor

Framework for AI agent evaluation in containerized environments. Use when: (1) Running agent evaluations with `harbor run` against benchmarks (SWE-Bench, Terminal-Bench, Aider Polyglot, etc.), (2) Creating custom benchmark tasks with Dockerfile, instruction.md, solution, and tests, (3) Building adapters to convert existing benchmarks to Harbor format, (4) Implementing custom agents extending BaseAgent or BaseInstalledAgent, (5) Scaling evaluations to cloud providers (Daytona, Modal, E2B), (6) Exporting traces for RL/SFT training, (7) Debugging Harbor runs or inspecting package internals.

99

Does it follow best practices?

Validation for skill structure

Overview
Skills
Evals
Files

tile.json

{
  "name": "honeybadge/harbor",
  "version": "0.1.0",
  "private": false,
  "summary": "Framework for AI agent evaluation in containerized environments. Use when: (1) Running agent evaluations with `harbor run` against benchmarks (SWE-Bench, Terminal-Bench, Aider Polyglot, etc.), (2) Creating custom benchmark tasks with Dockerfile, instruction.md, solution, and tests, (3) Building adapters to convert existing benchmarks to Harbor format, (4) Implementing custom agents extending BaseAgent or BaseInstalledAgent, (5) Scaling evaluations to cloud providers (Daytona, Modal, E2B), (6) Exporting traces for RL/SFT training, (7) Debugging Harbor runs or inspecting package internals.",
  "skills": {
    "harbor": {
      "path": "SKILL.md"
    }
  }
}

Install with Tessl CLI

npx tessl i honeybadge/harbor@0.1.0

SKILL.md

tile.json