CtrlK
BlogDocsLog inGet started
Tessl Logo

tessl-labs/audit-logs

Collect and normalize agent logs, discover installed verifiers, and dispatch LLM judges to evaluate adherence. Produces per-session verdicts and aggregated reports.

91

3.09x
Quality

90%

Does it follow best practices?

Impact

96%

3.09x

Average score across 3 eval scenarios

SecuritybySnyk

Passed

No known issues

Overview
Quality
Evals
Security
Files

summary-formats.mdskills/audit-logs/references/

Summary Formats

Read this when presenting audit results to the user.

Phase 1 Summary (single session)

Focus on what happened and what to fix. Do NOT use percentages or aggregate stats — this is one session. Group checks by tile so the user knows where each rule comes from.

When everything passes:

## Session Check — claude-code/abc123

  ✅ All checks passed — your agent followed the rules.

  ### elevenlabs/text-to-speech — ✅ all passing
  ✓ activated skill when needed
  ✓ no-hardcoded-key, correct-header, correct-js-package, valid-setting-ranges

  ### anthropics/frontend-design — ✅ all passing
  ✓ activated skill when needed
  ✓ use-tailwind-for-styling, no-inline-styles, dev-server-before-screenshot, ts-extensions

  Everything looks good. Want to check more sessions or add verifiers for other behaviors?

When there are failures:

## Session Check — claude-code/abc123

  ### elevenlabs/text-to-speech — ✅ all passing
  ✓ activated skill when needed
  ✓ no-hardcoded-key, correct-header, correct-js-package, valid-setting-ranges

  ### amyh/frontend-design — ⚠ 2 issues
  ✗ skill not activated
  ✓ tailwind-classes-used, no-css-modules, ts-extensions

  1. **no-inline-styles** — FAILED
     Rule: "Use Tailwind utility classes, never inline styles"
     What happened: Agent used style={{}} on 3 components (turns 12, 18, 24)
     → Fix: Replace inline styles with Tailwind classes in these files

  2. **dev-server-before-screenshot** — FAILED
     Rule: "Start dev server before taking screenshots"
     What happened: Agent ran screenshot tool at turn 31 without starting dev server
     → Fix: Run `bun dev` and retake the screenshot

Always explain what the rule says vs what actually happened. Offer to fix each issue.

When friction is enabled, add after the verifier results:

### Friction Points

  **amyh/frontend-design**
    Preventable (skill has the answer, agent didn't follow):
    - **repeated_failure** (turns 15-22, major) — Agent tried 3x to screenshot without dev server
      → dev-server-before-screenshot verifier also failed. Skill says to do this.

    Introduced (skill instructions caused the problem):
    - **wrong_approach** (turn 8, moderate) — Skill says use CSS modules but project uses Tailwind
      → Fix: update the skill's styling instructions

  **Unrelated**
    - **tool_misuse** (turn 32, minor) — Wrong file path in Read tool, resolved in 1 turn

Phase 2 Summary (across sessions)

Start with a tile-level overview (when 2+ tiles), then per-tile details. Lead with what's working.

## Cross-Session Analysis (12 sessions, last 7 days)

  ### Overview
  ✅ elevenlabs/text-to-speech — activated (1/3 sessions), all checks passing
  ⚠ amyh/frontend-design — never activated, 2 recurring issues
  ➖ amyh/webapp-testing — not relevant to recent sessions

  ### elevenlabs/text-to-speech — ✅ all passing
  ✓ no-hardcoded-key — 100% (4/4 applicable)
  ✓ correct-header — 100% (4/4)
  ✓ correct-js-package — 100% (2/2)
  ✓ valid-setting-ranges — 100% (6/6)

  ### amyh/frontend-design — ⚠ 2 recurring issues
  ✗ never activated 
  
  Consistently passing:
  ✓ tailwind-classes-used — 95% (10/10 applicable)
  ✓ ts-extensions — 100% (12/12)
  ✓ no-css-modules — 100% (8/8)

  Improving:
  ✓ no-inline-styles — 80% now (was 60% last week) — trending up

  Recurring issues:
  1. **dev-server-before-screenshot** — 40% pass rate (2/5 applicable sessions)
     Pattern: Agent consistently skips starting dev server before screenshots
     → This tile is installed locally — review its SKILL.md to clarify the
       instruction, or add to AGENTS.md as a backup.

  2. **color-tokens-only** — 60% pass rate (3/5 applicable sessions)
     Pattern: Agent uses raw hex values instead of design tokens
     → Run `tessl skill review tiles/frontend-design --optimize` to
       tighten the instruction wording.

When friction is enabled, extend the tile sections:

### amyh/frontend-design — ⚠ 2 failing checks + friction

  Adherence: 75% pass rate (dev-server-before-screenshot: 40%)

  Friction breakdown (4 events across 3 sessions):
    Preventable × 2 — agent not following existing instructions
    Introduced × 1 — skill's styling instruction conflicts with project setup
    Adjacent × 1 — responsive layout struggles not covered by skill

  → Priority: fix the introduced friction first (wrong instruction),
    then strengthen activation for preventable issues

tile.json