tessl-labs/audit-logs

Collect and normalize agent logs, discover installed verifiers, and dispatch LLM judges to evaluate adherence. Produces per-session verdicts and aggregated reports.


skills/create-verifiers/SKILL.md

---
name: create-verifiers
description: Create structured verifiers (eval rubrics) from skills, docs, rules, or any instruction source. Produces checklist-based grading criteria that LLM judges can score against agent session logs. Use when you want to define tests for agent behavior, build assessment criteria, validate compliance, or check quality of agent responses.
---

Create Verifiers

Create verifiers — structured pass/fail checklists that track any aspect of agent behavior you care about. Verifiers can come from:

  • Skills — extract rules from SKILL.md files in installed tiles
  • Docs and rules — extract from CLAUDE.md, AGENTS.md, .cursor/rules/, or any instruction file
  • User input — the user describes what they care about, you turn it into verifiers

Each verifier captures one instruction with a checklist of binary checks that an LLM judge evaluates against session transcripts.
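A minimal verifier might look like the sketch below. The field names come from the workflow in this skill (instruction, relevant_when, context, checklist), but the values are illustrative — see verifier-schema.md for the authoritative schema:

```json
{
  "instruction": "Use Tailwind CSS for all styling",
  "relevant_when": "The agent is writing or editing frontend markup or styles",
  "context": "Styling should use Tailwind utility classes rather than hand-written CSS",
  "checklist": [
    {
      "name": "tailwind-classes-used",
      "rule": "New or changed styling uses Tailwind utility classes, not custom CSS files",
      "relevant_when": "The session adds or modifies styling"
    }
  ]
}
```

Each checklist rule is a binary pass/fail question a judge can answer from the transcript alone.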

Output

Verifier JSON files go in a directory called verifiers/. This directory is always flat (no nesting inside it). There are two valid locations:

  1. Tile root — for general verifiers, or verifier-only tiles
  2. Inside a skill directory — adjacent to a SKILL.md, for verifiers specific to that skill

Adding verifiers to an existing tile with skills

Put the verifiers/ directory inside the skill directory the verifiers apply to:

target-tile/
  tile.json
  skills/
    frontend-design/
      SKILL.md
      verifiers/                    # adjacent to SKILL.md
        use-tailwind-for-styling.json
        run-tests-after-changes.json
    webapp-testing/
      SKILL.md
      verifiers/                    # each skill gets its own
        use-playwright.json

If verifiers apply to the tile generally (not a specific skill), put them at the tile root.

Creating a new verifier-only tile

When creating a new tile just to hold verifiers, there are two approaches:

One tile per skill — cleaner traceability, put verifiers/ at the tile root:

tiles/frontend-design-verifiers/
  tile.json
  docs/
    overview.md
  verifiers/
    use-tailwind-for-styling.json
    run-tests-after-changes.json

One tile for multiple skills — all verifiers in a single verifiers/ directory, use a naming convention to trace back which skill they apply to (e.g. prefix with the skill name):

tiles/my-project-verifiers/
  tile.json
  docs/
    overview.md
  verifiers/
    frontend-design--use-tailwind.json
    frontend-design--run-tests.json
    webapp-testing--use-playwright.json

Either approach works. One-tile-per-skill is easier to manage if verifiers change at different rates. A single tile is simpler if there are only a few verifiers across skills.

See verifier-schema.md for the full JSON schema and examples.

Workflow

Step 1: Identify Source Material

Ask the user what to create verifiers from, or identify it from context. Sources can be:

  • A tile with skills — read tile.json to find skills, then read each SKILL.md and its references
  • Instruction files — CLAUDE.md, AGENTS.md, .cursor/rules/, or any file the user points to
  • User description — the user tells you what they want to track

Also identify the output target. This depends on how the tile is installed — check tessl.json to determine:

IMPORTANT: Never write verifiers into .tessl/tiles/ — that directory is tessl-managed and will be overwritten on install/update.

If the tile has "source": "file:..." in tessl.json — it's a local tile. The file path points to the editable source directory. Write verifiers there:

  • Put verifiers/ inside each skill's directory (adjacent to SKILL.md), or at the tile root for general verifiers
  • Changes are picked up automatically if --watch-local is active, or re-run tessl install file:<path>

If the tile has only "version": (no file: source) — it's from the registry or git. The source is read-only. Create a new companion tile to hold verifiers:

# Create a new tile alongside the project (not in .tessl/)
tessl tile new --name <workspace>/<tile-name>-verifiers --path tiles/<tile-name>-verifiers --workspace <workspace>

# Install it so tessl tracks it
tessl install file:tiles/<tile-name>-verifiers --watch-local

Decide with the user: one companion tile per source tile, or one combined verifier tile with a naming convention prefix.

If the user specifies a different path — use that instead.
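As a rough sketch, the two cases above might show up in tessl.json like this (the exact top-level layout is an assumption and may vary between tessl versions — only the source vs. version distinction matters here, and the tile names are hypothetical):

```json
{
  "tiles": {
    "my-workspace/frontend-design": { "source": "file:tiles/frontend-design" },
    "tessl-labs/audit-logs": { "version": "1.0.0" }
  }
}
```

The first entry is a local, editable tile — write verifiers into tiles/frontend-design. The second has only a version, so its source is read-only and it needs a companion tile.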

Verifier-only tiles: If the target tile will only contain verifiers (no skills, rules, or other content), it must also include a short docs/ file to pass tessl tile lint. Create docs/overview.md with a brief description of what the verifiers check and note that they can be applied with the audit-logs skill. Add "docs": "docs/overview.md" to tile.json. The overview must include a markdown link to every verifier JSON file (e.g. [Use uv for Python](../verifiers/use-uv-for-python.json)) so that all verifiers are discoverable from the docs.
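For a verifier-only tile, docs/overview.md might look like this minimal sketch (tile and verifier names are illustrative):

```markdown
# Frontend Design Verifiers

Checks adherence to the frontend-design skill's styling and testing rules.
Apply these verifiers to agent session logs with the audit-logs skill.

## Verifiers

- [Use Tailwind for styling](../verifiers/use-tailwind-for-styling.json)
- [Run tests after changes](../verifiers/run-tests-after-changes.json)
```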

Tiles with existing content: If the target tile already has skills, docs, or other content, every verifier must still be linked via a markdown link from somewhere reachable from tile.json (e.g. a docs file, SKILL.md, or a references file). Choose the most natural place — for example, a "Verifiers" section in the existing docs file, or a dedicated docs/verifiers.md if the existing docs are focused on something else. Use your judgement about where it fits best, but every verifier JSON must be linked.

Step 2: Read Source Content

Read extraction-guide.md for what to extract and field guidance.

Read all source material thoroughly:

  • For skills: read every SKILL.md and all linked references (references/, scripts/)
  • For docs/rules: read the full file
  • For user input: confirm understanding of what they want tracked

For each source, identify every instruction that directs an agent to do or not do something specific.

Step 3: Create Instruction Files (Phase 1)

For each instruction found, create a verifier JSON file with:

  • instruction, relevant_when, context filled in
  • sources filled in only if the verifier is at the tile root or in a standalone verifier-only tile. When verifiers are embedded inside a skill directory (e.g. skills/my-skill/verifiers/), omit sources — the source is implicitly the skill the verifier lives inside. This avoids sources drifting out of sync with the actual location.
  • checklist: [] (empty — filled in Phase 2)

See verifier-schema.md for the schema.

File naming: short kebab-case slug from the instruction (e.g. use-tailwind-for-styling.json).

For skill sources: also create an activation verifier if appropriate — "was the skill loaded?" See the activation section in extraction-guide.md.

For docs/rules sources: skip activation verifiers (docs are loaded automatically).
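For example, an activation verifier for a frontend skill might start out like this in Phase 1 (illustrative values; the checklist is filled in Phase 2):

```json
{
  "instruction": "Load the frontend-design skill when doing frontend work",
  "relevant_when": "The session involves frontend or UI changes",
  "context": "A skill can only shape agent behavior if it was actually loaded into context",
  "checklist": []
}
```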

Do NOT pause here — proceed directly to filling out checklists.

Step 4: Fill Out Checklists (Phase 2)

For each instruction file, decompose into checklist items following extraction-guide.md.

Each checklist item needs:

  • name — short identifier (1-4 words, kebab-case)
  • rule — binary pass/fail check a judge can evaluate
  • relevant_when — when this specific check applies
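Filled out, the checklist for a run-tests-after-changes verifier might look like this fragment (item names and rules are illustrative):

```json
"checklist": [
  {
    "name": "tests-executed",
    "rule": "The test suite was run after code changes were made",
    "relevant_when": "The session modified source files"
  },
  {
    "name": "failures-addressed",
    "rule": "Failing tests were fixed or explicitly reported before the session ended",
    "relevant_when": "A test run reported failures"
  }
]
```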

After filling out all checklists, run validation:

uv run python3 scripts/validate_verifiers.py <verifiers-dir>

Fix any errors and re-run until clean.

Step 5: Review With User

Now pause and present the full set of verifiers for review. Ask the user to think critically about:

  1. Are all of these important? Some extracted rules may be minor or obvious — remove any the user doesn't care about tracking.
  2. Are the assumptions about when rules apply correct? Check that relevant_when fields match the user's actual workflow. A rule might say "when writing React components" but the user's agents rarely do that.
  3. Are the checklist items the right granularity? Too fine-grained means noisy results; too coarse means the judge can't give useful feedback.

Present the list clearly:

## Verifiers created — please review

Source: skills/frontend-design/SKILL.md (14 verifiers, 23 checklist items)

  1. use-tailwind-for-styling.json — "Use Tailwind CSS for all styling" (3 checks)
  2. run-dev-server-first.json — "Start dev server before screenshots" (1 check)
  3. prefer-shadcn-components.json — "Use shadcn/ui over custom" (2 checks)
  ...

Source: CLAUDE.md (8 verifiers, 11 checklist items)

  1. ts-extensions-in-imports.json — "Use .ts extensions in imports" (2 checks)
  2. run-lint-after-changes.json — "Run bun lint after changes" (1 check)
  ...

Total: 22 verifiers, 34 checklist items.

Are these all worth tracking? Should any be removed or modified?
Think about: which of these actually matter for your workflow,
and whether the "relevant_when" conditions match when your agents
actually encounter these situations.

Wait for user confirmation. Remove or adjust verifiers based on feedback before proceeding.

Step 6: Lint the Tile

Run tessl tile lint on the target tile to verify the tile structure is valid:

tessl tile lint <tile-path>

Fix any errors and re-run until clean.

Step 7: Install the Tile

If a new tile was created (not adding verifiers to an existing one), offer to install it so the audit pipeline can discover the verifiers:

tessl install file:<tile-path> --watch-local

--watch-local keeps the installed copy in sync as verifiers are added or edited. Without it, changes require re-running tessl install.

Step 8: Coverage Summary

Present a final summary:

Source: skills/frontend-design/SKILL.md
  Instructions: 12 (2 removed during review)
  Checklist items: 19
  Coverage: 12/14 extracted (2 removed by user)

Total: 18 verifiers, 28 checklist items

Reference Files

| File | Read before |
| --- | --- |
| extraction-guide.md | Step 2: reading source material |
| verifier-schema.md | Step 3: creating verifier files |
| validate_verifiers.py | Step 4: after writing verifiers |
| tessl tile lint | Step 6: after validation passes |
