Reference tile for Themis, a Node.js and TypeScript unit test framework designed for AI coding agents. Covers unit-test authoring, Jest/Vitest migration, agent-readable failure output with repair hints, and first-class integrations for Claude Code, Cursor, and generic agents.
96
94%
Does it follow best practices?
Impact
97%
2.69x
Average score across 10 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent uses the --claude-code init flag (not just --agents) and correctly documents all four artifacts it installs: CLAUDE.md, the skill file, slash commands, and the optional PostToolUse hook.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "--claude-code flag used",
      "description": "setup.sh contains `themis init --claude-code` (or `npx themis init --claude-code` / `bunx themis init --claude-code`), NOT just `themis init --agents`",
      "max_score": 20
    },
    {
      "name": "CLAUDE.md mentioned in notes",
      "description": "SETUP_NOTES.md lists `CLAUDE.md` as a file installed by the initialization at the repo root",
      "max_score": 12
    },
    {
      "name": "Skill file path mentioned",
      "description": "SETUP_NOTES.md mentions `.claude/skills/themis/SKILL.md` (or `.claude/skills/themis/`) as an installed artifact",
      "max_score": 12
    },
    {
      "name": "Slash commands mentioned",
      "description": "SETUP_NOTES.md mentions `.claude/commands/` or the slash commands `themis-test`, `themis-generate`, `themis-migrate`, `themis-fix`",
      "max_score": 12
    },
    {
      "name": "PostToolUse hook mentioned",
      "description": "SETUP_NOTES.md mentions the `PostToolUse` hook or `scripts/claude-hook.js` as an optional component",
      "max_score": 10
    },
    {
      "name": "generate command in setup.sh",
      "description": "setup.sh includes `themis generate src` or `themis generate src/` to create baseline tests",
      "max_score": 10
    },
    {
      "name": "Skill auto-load behavior described",
      "description": "SETUP_NOTES.md explains that the skill file auto-loads when the user asks to write, run, fix, or migrate tests",
      "max_score": 12
    },
    {
      "name": "Hook feedback loop described",
      "description": "SETUP_NOTES.md explains that the PostToolUse hook runs Themis after edits and feeds failures back into the conversation",
      "max_score": 12
    }
  ]
}
}
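The checklist weights above can be sanity-checked with a short TypeScript sketch. The names and `max_score` values are copied from the JSON; the `ChecklistItem` type, the helper functions, and the scoring rule (clamp each awarded value to its `max_score`, sum, report as a percentage) are assumptions made for illustration, not the evaluator's documented behavior.

```typescript
// Hypothetical type: field names mirror the JSON above.
interface ChecklistItem {
  name: string;
  max_score: number;
}

// Criterion names and weights copied from the checklist above.
const checklist: ChecklistItem[] = [
  { name: "--claude-code flag used", max_score: 20 },
  { name: "CLAUDE.md mentioned in notes", max_score: 12 },
  { name: "Skill file path mentioned", max_score: 12 },
  { name: "Slash commands mentioned", max_score: 12 },
  { name: "PostToolUse hook mentioned", max_score: 10 },
  { name: "generate command in setup.sh", max_score: 10 },
  { name: "Skill auto-load behavior described", max_score: 12 },
  { name: "Hook feedback loop described", max_score: 12 },
];

// Sum of all max_score values in the checklist.
function totalMaxScore(items: ChecklistItem[]): number {
  return items.reduce((sum, item) => sum + item.max_score, 0);
}

// Assumed scoring rule: each awarded value is clamped to [0, max_score],
// a criterion missing from `awarded` counts as 0, and the result is the
// earned total expressed as a percentage of the maximum.
function scorePercent(
  items: ChecklistItem[],
  awarded: Record<string, number>,
): number {
  const earned = items.reduce((sum, item) => {
    const raw = awarded[item.name] ?? 0;
    return sum + Math.min(Math.max(raw, 0), item.max_score);
  }, 0);
  return (earned * 100) / totalMaxScore(items);
}

// The eight weights sum to 100, so each max_score doubles as a percentage.
console.log(totalMaxScore(checklist)); // 100
```

Under this assumed rule, a submission that only satisfies the `--claude-code` criterion would score 20%, which reflects that criterion carrying the largest single weight in the checklist.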