tessleng/skill-insights

Scan a directory or workspace for SKILL.md files across all agents and repos, capture supporting files (references, scripts, linked docs), dedupe vendored copies, enrich each Tessl tile with registry signals, and emit a canonical JSON inventory validated by JSON Schema. Then run four analytical phases in parallel against the inventory: staleness + git provenance (history, broken refs, contributors), quality (Tessl `skill review`), duplicates (similarity + LLM judgement), and registry-search (per-standalone-skill registry suggestions, HTTP only). Finally, render a self-contained interactive HTML report with a top-of-report health overview, a top-issues panel, a recently-changed list, and a per-tessl.json manifests view.

Registry signals: Quality 90% ("Does it follow best practices?"), Impact 97% (1.44x, average score across 2 eval scenarios), Security by Snyk: Advisory (suggest reviewing before use).

skills/analyze-skill-quality/SKILL.md

name: analyze-skill-quality
description: Score every skill in a discovery.json using `tessl skill review --json` (Tessl's canonical rubric: validation checks + description judge + content judge + 0-100 review score). Pulls tile-level quality directly from registry data already in discovery for tiles scored on the registry. One of three analytical phases that run in parallel after discovery. Use when asked to assess skill quality, find low-quality skills, or surface description / activation issues.

Analyze Skill Quality

Score each skill in discovery.json using tessl skill review. Output conforms to quality.schema.json (currently schema_version: "2.0"). The script validates input (discovery.json) and output at the IO boundary; malformed inputs/outputs exit with code 2.
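For example, pointing it at a file that does not conform to discovery.schema.json should fail fast at that boundary (the input file name here is illustrative):

python3 <skill-dir>/scripts/analyze_quality.py \
  --discovery not-a-discovery.json \
  --output /tmp/quality.json
echo $?   # prints 2: malformed input rejected at the IO boundary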

CLI-driven phase. Single bundled script does everything: it iterates skills, shells out to tessl skill review --json in parallel batches, pulls tile-level quality from discovery.tiles[].registry.scores.quality where present, and writes quality.json.

No subagents. No custom rubric. No prompt files on disk. The Tessl backend provides the canonical quality assessment via the same review pipeline that powers tessl skill publish gating.

Inputs

  • A discovery.json produced by discover-skills.
  • Optional --max-skills N to cap the number reviewed (fast iteration).
  • Optional --concurrency N (default 8) for parallel tessl skill review calls.
  • Optional --skip-published-skills — skip the per-skill review for skills whose owning tile already has registry.scores.quality. Uses the tile-level score as a passthrough. Cheaper but loses per-skill detail in published tiles.

Run

python3 <skill-dir>/scripts/analyze_quality.py \
  --discovery "$DISCOVERY_PATH" \
  --output "$OUTPUT_PATH"        # default: <dirname(discovery)>/quality.json
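The optional flags from Inputs compose with the base command. For example, a capped, higher-concurrency run that reuses tile-level registry scores where available (paths are illustrative):

python3 <skill-dir>/scripts/analyze_quality.py \
  --discovery ./out/discovery.json \
  --output ./out/quality.json \
  --max-skills 10 \
  --concurrency 12 \
  --skip-published-skills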

The script:

  1. Reads every skill from discovery.json. For each skill, builds an absolute path = <repo.path>/<skill.primary_path>.
  2. (Optional) If --skip-published-skills is set, skips skills whose owning tile already has registry.scores.quality and stamps the tile-level score as a passthrough verdict.
  3. Spawns up to --concurrency parallel tessl skill review --json <abs_path> invocations using asyncio + subprocess (a shell equivalent of this fan-out is sketched after this list).
  4. Parses each JSON response into a normalized per-skill record (review score, verdict band, validation results, description + content judge breakdowns, suggestions). Skills that fail or are skipped by --max-skills still get a per_skill[] row with verdict: "unknown" and _status.
  5. Builds a per_tile[] rollup. Each tile's score comes from registry.scores.quality if available; otherwise from the mean of the tile's per-skill review scores.
  6. Computes the estate summary (avg score, by_verdict counts, validation-failure count, source-attribution counts).
  7. Writes quality.json.
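Step 3 is conceptually the same as fanning reviews out with xargs. A minimal shell sketch, not the script's actual implementation; it assumes discovery.json exposes per-skill paths at skills[].primary_path (the real script joins repo.path and primary_path, so adjust to the actual schema):

# Hedged sketch: up to 8 concurrent reviews, one JSON document per skill, concatenated.
jq -r '.skills[].primary_path' discovery.json \
  | xargs -P 8 -I{} tessl skill review --json "{}" > reviews.out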

Cost characteristics

tessl skill review calls go through Tessl's /experimental/skills/review endpoint, which routes the LLM judges through LiteLLM. LiteLLM caches by (prompt, model) for 24 hours in production, so repeat scans on unchanged content are essentially free. First-time scans on a fresh estate pay full LLM cost on the backend (~12s per skill, parallelisable).

For a 72-skill repo: ~2 min wall-clock on first scan with concurrency 8 (72 skills ÷ 8 concurrent ≈ 9 waves × ~12 s ≈ 110 s), and sub-second on cached re-runs.

Verify

jq -e '.schema_version == "2.0"' "$OUTPUT_PATH" > /dev/null

Sanity-check: every per_skill[].skill_id should appear in the source discovery.json.
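That check can be automated with jq. A sketch assuming discovery.json lists its skill ids at skills[].id (adjust the path to discovery.schema.json's actual layout):

jq -e --slurpfile disc "$DISCOVERY_PATH" '
  ([.per_skill[].skill_id] - [$disc[0].skills[].id]) == []
' "$OUTPUT_PATH" > /dev/null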

Standalone testability

  • Run in isolation: pass a discovery.json and an output path. Stand-alone Python; no orchestrator required.
  • Spike one skill: --max-skills 1 runs the review for just the first skill. ~12s cold, sub-second cached.
  • Replay: keep an old discovery.json and re-run quality against it as the rubric / CLI evolves; see the comparison sketch below.
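One way to use replays is to compare the estate summary between an old and a new run as the rubric changes. A sketch; the summary.avg_score key is an assumption, so check quality.schema.json for the exact field name:

jq '.summary.avg_score' old/quality.json new/quality.json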

Requirements

  • tessl CLI on PATH (typically ~/.local/bin/tessl).
  • An authenticated Tessl session (tessl whoami should succeed). Anonymous review works but --model overrides require auth.
  • Network access to https://api.tessl.io.
  • jsonschema Python package (soft dep) — IO contract validation against discovery.schema.json and quality.schema.json. Skipped with a single stderr warning if missing.
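A quick preflight covering these requirements might look like this (a sketch; the connectivity probe only checks that the host answers at all):

command -v tessl >/dev/null || echo "tessl CLI not on PATH"
tessl whoami >/dev/null    || echo "no authenticated Tessl session"
curl -s -o /dev/null https://api.tessl.io || echo "cannot reach api.tessl.io"
python3 -c "import jsonschema" 2>/dev/null || echo "jsonschema missing; IO validation will be skipped"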

If tessl isn't found, every skill gets a per_skill[] row with _status: "failed" and _error: "`tessl` CLI not found in PATH", and metadata.failed_skills[] lists the same failures. The orchestrator surfaces this clearly so the user knows to install/log in.
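Both failure surfaces can be pulled straight from the output:

jq '[.per_skill[] | select(._status == "failed")] | length' "$OUTPUT_PATH"
jq '.metadata.failed_skills' "$OUTPUT_PATH"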

Files

  • skills/analyze-skill-quality/
  • README.md
  • tile.json