tessl-labs/audit-logs

Collect and normalize agent logs, discover installed verifiers, and dispatch LLM judges to evaluate adherence. Produces per-session verdicts and aggregated reports.

Verifier Schema

A verifier is a JSON file containing one instruction and its checklist of things to evaluate. Each instruction produces one file. Judge agents use these to score agent sessions.

File Layout

Verifiers live in verifiers/ directories inside a tile. For tiles with skills, place verifiers inside the skill directory. For verifier-only tiles, place them at the tile root.

Tile with skills (default):

my-tile/
  tile.json
  skills/
    my-skill/
      SKILL.md
      verifiers/
        use-tailwind-for-styling.json
        run-tests-after-changes.json

Verifier-only tile:

my-tile/
  tile.json
  docs/
    overview.md
  verifiers/
    prefer-bun-over-npm.json

The audit pipeline discovers verifiers anywhere in the tile tree — root, skill subdirectories, or any other location with a verifiers/ directory.
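
As an illustration of the discovery rule above, here is a minimal Python sketch that finds every JSON file sitting directly inside a verifiers/ directory anywhere under a tile. The function name discover_verifiers is hypothetical, and this is not the audit pipeline's actual implementation, only one way the described behavior could be reproduced.

```python
from pathlib import Path


def discover_verifiers(tile_root: str) -> list[Path]:
    """Return every *.json file located directly inside a verifiers/ directory.

    Hypothetical helper: it mirrors the discovery rule described in the text,
    not the real pipeline code.
    """
    root = Path(tile_root)
    found = [
        path
        for path in root.rglob("*.json")  # walk the whole tile tree
        if path.is_file() and path.parent.name == "verifiers"
    ]
    return sorted(found)
```

Calling discover_verifiers("my-tile") on either layout shown above would pick up the verifier files and ignore tile.json, which is not inside a verifiers/ directory.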

File names should be short kebab-case slugs derived from the instruction (e.g. use-tailwind-for-styling.json).
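
A rough sketch of deriving such a slug in Python follows. The helper verifier_filename and its max_words parameter are assumptions made for illustration, and the instruction strings are made up. A hand-picked slug that drops filler words will often read better than this mechanical one.

```python
import re


def verifier_filename(instruction: str, max_words: int = 5) -> str:
    """Build a kebab-case file name from the first few words of an instruction.

    Hypothetical helper: max_words is an arbitrary cap, not part of the schema.
    """
    words = re.findall(r"[a-z0-9]+", instruction.lower())
    if not words:
        raise ValueError("instruction contains no usable words")
    return "-".join(words[:max_words]) + ".json"


print(verifier_filename("Prefer bun over npm"))      # prefer-bun-over-npm.json
print(verifier_filename("Run tests after changes"))  # run-tests-after-changes.json
```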

Schema

{
  "instruction": "Use Tailwind CSS for all styling",
  "relevant_when": "Agent is writing or modifying frontend React components",
  "context": "The project uses Tailwind v4 with the Vite plugin. Inline styles and CSS modules should be avoided. Tailwind classes go in className attributes on JSX elements.",
  "sources": [
    {
      "type": "file",
      "filename": "skills/frontend-design/SKILL.md",
      "tile": "anthropics/frontend-design@1.2.0",
      "line_no": 42
    }
  ],
  "checklist": [
    {
      "name": "tailwind-classes-used",
      "rule": "Agent uses Tailwind utility classes (className='...') when writing JSX/TSX components",
      "relevant_when": "Agent writes or modifies React component files"
    },
    {
      "name": "no-inline-styles",
      "rule": "Agent does not use inline style objects or style={{ }} attributes",
      "relevant_when": "Agent writes or modifies React component files"
    },
    {
      "name": "no-css-modules",
      "rule": "Agent does not create or import .module.css files",
      "relevant_when": "Agent creates new style files or imports for components"
    }
  ]
}

Field Reference

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| instruction | string | yes | The rule from the source material, stated positively and specifically |
| relevant_when | string | yes | When this instruction applies at the session level. If the session has nothing to do with this scenario, the judge skips the entire instruction |
| context | string | yes | 2-3 sentences of background: definitions, applicability, edge cases. Helps the judge understand intent without reading the full source |
| sources | array | no | Where this instruction came from (see Sources below). Omit when the verifier is embedded inside a skill directory, because the source is implied by location. Include only for root-level or standalone verifiers |
| checklist | array | yes | One or more checks to evaluate (see Checklist items below) |
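
The top-level requirements above could be checked with a small Python function along these lines. The name check_top_level and the wording of the messages are illustrative assumptions; the source does not describe an official validator.

```python
REQUIRED_STRINGS = ("instruction", "relevant_when", "context")


def check_top_level(verifier: dict) -> list[str]:
    """Return a list of problems with the top-level fields; empty means OK.

    Hypothetical helper based only on the field table in this document.
    """
    problems = []
    for field in REQUIRED_STRINGS:
        value = verifier.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: required non-empty string")
    checklist = verifier.get("checklist")
    if not isinstance(checklist, list) or not checklist:
        problems.append("checklist: required array with at least one item")
    # sources is optional, but must be an array when it is present
    if "sources" in verifier and not isinstance(verifier["sources"], list):
        problems.append("sources: must be an array when present")
    return problems
```

Returning a list of messages rather than raising on the first failure lets an author see every missing field in one pass.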

Sources

Each source identifies where the instruction was found:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | yes | "file" (from a document) or "user" (stated by user directly) |
| filename | string | when type=file | Path to source file, relative to tile root |
| tile | string | no | Tile identifier if instruction came from an installed tile (e.g. "anthropics/frontend-design@1.2.0") |
| line_no | int | no | Line number in the source file |
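
The conditional requirement on filename is the easiest part of this table to get wrong, so here is a hedged Python sketch of a per-source check. The function check_source is a hypothetical name, and the rules it enforces are taken only from the table above.

```python
def check_source(source: dict) -> list[str]:
    """Return problems found in one entry of the sources array (hypothetical helper)."""
    problems = []
    kind = source.get("type")
    if kind not in ("file", "user"):
        problems.append('type: must be "file" or "user"')
    # filename is only required for sources that come from a document
    if kind == "file" and not isinstance(source.get("filename"), str):
        problems.append("filename: required when type is file")
    if "tile" in source and not isinstance(source["tile"], str):
        problems.append("tile: must be a string when present")
    line_no = source.get("line_no")
    # bool is a subclass of int in Python, so exclude it explicitly
    if line_no is not None and (isinstance(line_no, bool) or not isinstance(line_no, int)):
        problems.append("line_no: must be an integer when present")
    return problems
```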

Checklist items

Each checklist item is one binary check the judge evaluates:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Short identifier, 1-4 words, kebab-case. Must be unique within the file |
| rule | string | yes | What the agent should or shouldn't do. Binary and specific: a judge must be able to answer yes/no |
| relevant_when | string | yes | When this specific check applies. Can be narrower than the instruction-level relevant_when |
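
The naming constraints (kebab-case, 1-4 words, unique within the file) lend themselves to a mechanical check. The following Python sketch is one possible reading of those constraints; check_checklist and the regular expression are assumptions, not part of the schema.

```python
import re

# Kebab-case: lowercase alphanumeric words joined by single hyphens.
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_checklist(items: list[dict]) -> list[str]:
    """Return problems found in a checklist array (hypothetical helper)."""
    problems = []
    seen = set()
    for index, item in enumerate(items):
        name = item.get("name")
        if not isinstance(name, str) or not KEBAB.fullmatch(name):
            problems.append(f"item {index}: name must be kebab-case")
        else:
            if len(name.split("-")) > 4:
                problems.append(f"item {index}: name should be 1-4 words")
            if name in seen:
                problems.append(f"item {index}: duplicate name {name!r}")
            seen.add(name)
        for field in ("rule", "relevant_when"):
            value = item.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: {field} is a required string")
    return problems
```

Whether a rule is truly binary still needs human or judge review; this sketch only covers the structural parts of the table.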

Writing Good Checklist Items

Rules should be binary

| Avoid (needs interpretation) | Use instead (binary) |
| --- | --- |
| "Properly handles errors" | "Uses try/catch around external API calls" |
| "Follows the import style" | "Local import paths use .ts extension, not .js" |
| "Good commit messages" | "Commit message contains more than 5 words" |
| "Creative layout" | "Uses at least ONE of: asymmetric grid, overlapping elements, rotated content" |

When to split into multiple checklist items

Split when an instruction contains:

  • Multiple independent requirements: "use X library and configure Y setting" -> 2 items
  • Requirements at different scopes: "in tests do A; in production code do B" -> 2 items

When to keep as one item

  • "Do X, not Y" patterns: "Agent uses X, not Y" is one binary check, so do not split it
  • Requirements that are inherently coupled: "use the --frozen-lockfile flag with bun install"
  • Presence/absence of a single thing: "imports pdfplumber"

Coverage target

Each instruction should have 1-5 checklist items. A typical instruction has 1-3. If you're writing more than 5, you may be over-decomposing — consider whether some checks are really testing the same thing.
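
The coverage target can be turned into a simple lint, sketched below in Python. The function name checklist_size_note and the message text are hypothetical; the thresholds come from the paragraph above.

```python
from typing import Optional


def checklist_size_note(checklist: list) -> Optional[str]:
    """Return a note when the checklist size is outside the 1-5 target, else None."""
    count = len(checklist)
    if count == 0:
        return "error: an instruction needs at least one checklist item"
    if count > 5:
        # More than 5 is a hint of over-decomposition, not a hard failure
        return f"warning: {count} items; consider merging overlapping checks"
    return None
```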

Examples

From a skill file

{
  "instruction": "Always run collect_logs.py before normalize_logs.py",
  "relevant_when": "Agent is running the log normalization pipeline",
  "context": "The normalization script expects raw logs to exist in the raw/ directory. Running it without collecting first will produce empty output or errors.",
  "sources": [
    {
      "type": "file",
      "filename": "skills/audit-skill/SKILL.md",
      "line_no": 85
    }
  ],
  "checklist": [
    {
      "name": "collect-before-normalize",
      "rule": "Agent runs collect_logs.py before running normalize_logs.py in the same session",
      "relevant_when": "Agent runs normalize_logs.py"
    }
  ]
}

From a CLAUDE.md rules file

{
  "instruction": "Use .ts extensions in local imports, not .js",
  "relevant_when": "Agent is writing or editing TypeScript files with local imports",
  "context": "The project uses modern TypeScript with native .ts resolution. Using .js extensions in imports is a legacy pattern that should be avoided.",
  "sources": [
    {
      "type": "file",
      "filename": "CLAUDE.md",
      "line_no": 28
    }
  ],
  "checklist": [
    {
      "name": "uses-ts-extension",
      "rule": "Local import paths in written or edited TypeScript files end in .ts",
      "relevant_when": "Agent writes or edits files containing local imports"
    },
    {
      "name": "no-js-extension",
      "rule": "No local import paths in written or edited TypeScript files end in .js",
      "relevant_when": "Agent writes or edits files containing local imports"
    }
  ]
}

From user input

{
  "instruction": "Always use bun, never npm or yarn",
  "relevant_when": "Agent runs package management commands",
  "context": "User preference for bun as the package manager. This applies to install, add, remove, and run commands.",
  "sources": [
    {
      "type": "user"
    }
  ],
  "checklist": [
    {
      "name": "uses-bun",
      "rule": "Agent uses bun (not npm or yarn) for package management commands like install, add, remove, run",
      "relevant_when": "Agent runs package management commands"
    }
  ]
}
