tessleng/skill-insights

Scan a directory or workspace for SKILL.md files across all agents and repos, capture supporting files (references, scripts, linked docs), dedupe vendored copies, enrich each Tessl tile with registry signals, and emit a canonical JSON inventory validated by JSON Schema. Then run four analytical phases in parallel against the inventory — staleness + git provenance (history, broken refs, contributors), quality (Tessl `skill review`), duplicates (similarity + LLM judgement), registry-search (per-standalone-skill registry suggestions, HTTP only) — and render a self-contained interactive HTML report with a top-of-report health overview, top-issues panel, recently-changed list, and per-tessl.json manifests view.

name: discover-skills
description: Discover every SKILL.md file across one or more repositories, capture supporting files (references/, scripts/, linked docs, inline-backtick paths), dedupe vendored copies, and enrich each Tessl tile with registry data (quality, security, uplift, eval scores), outdated check, and context-cost analysis. Emits a canonical JSON inventory. Use when asked to inventory skills across a repo or workspace, build a discovery.json, or as the first phase of a skill-estate pipeline.

Discover Skills

Name: tessleng/skill-insights
Rating: 84.71 (1 review)
Author: tessleng

Produce a canonical inventory of every SKILL.md file under the scan target, plus enrichment for every Tessl tile found. Deterministic task — no LLM judgement.

Output conforms to discovery.schema.json (currently schema_version: "1.4"). When jsonschema is installed, the script validates its output against this schema before writing, so malformed output makes the script exit with code 2 rather than silently corrupting downstream phases. See references/schemas/ for all phase contracts and the shared validator.

Inputs

SCAN_ROOT — directory to scan. Defaults to $(pwd). Used both as the auto-discovery root and as the default output base.
OUTPUT_PATH — where to write the JSON. Defaults to $SCAN_ROOT/.skill-insights/discovery.json.
--repo PATH (repeatable) — explicitly select repos to include. When provided, replaces auto-discovery entirely — only the listed repos are scanned. Each path can be absolute or relative to --scan-root.

Run the bundled script

# Auto-discovery: scan a single repo, OR a workspace of git children
python3 <skill-dir>/scripts/discover_skills.py \
  --scan-root "$SCAN_ROOT" \
  --output "$OUTPUT_PATH"

# Explicit selection: pick exactly which repos to include
python3 <skill-dir>/scripts/discover_skills.py \
  --scan-root "$WORKSPACE" \
  --repo "$WORKSPACE/monorepo" \
  --repo "$WORKSPACE/lightdash" \
  --output "$OUTPUT_PATH"

<skill-dir> resolves to $HOME/.tessl/tiles/tessleng/skill-insights/skills/discover-skills/ for a global install, or $(pwd)/.tessl/tiles/tessleng/skill-insights/skills/discover-skills/ for a per-repo install. The script also accepts SCAN_ROOT, OUTPUT_PATH, and SCAN_ID as environment variables if flags aren't passed.
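The two install locations above can be probed programmatically. The following is a minimal sketch, not part of the bundled tooling: the helper name `resolve_skill_dir` is hypothetical, and checking the per-repo install before the global one is an assumption (the text does not state a precedence).

```python
from pathlib import Path
from typing import Optional

# Path of the skill inside a Tessl install, as given in the text above.
TILE_SUBPATH = Path(".tessl/tiles/tessleng/skill-insights/skills/discover-skills")


def resolve_skill_dir(cwd: Path, home: Path) -> Optional[Path]:
    """Return the first install location containing the script, or None.

    Assumption: a per-repo install (under cwd) is preferred over a
    global install (under home).
    """
    for base in (cwd, home):
        candidate = base / TILE_SUBPATH
        if (candidate / "scripts" / "discover_skills.py").is_file():
            return candidate
    return None
```

A caller would typically pass `Path.cwd()` and `Path.home()`, then report a clear error when the result is `None`.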

Repo selection modes

| Mode | Trigger | Behaviour |
| --- | --- | --- |
| Single repo | `--scan-root` is a git repo (no `--repo`) | Just that one repo |
| Workspace auto | `--scan-root` is a parent dir with git-repo children (no `--repo`) | Every immediate `.git`-having child |
| Single non-git | `--scan-root` is a directory with no `.git` and no git children (no `--repo`) | Treated as a single non-git "repo" |
| Explicit selection | One or more `--repo` flags | Exactly the listed repos; ignores any other children of `--scan-root` |

Use --repo when you want to scan a curated subset of a workspace (e.g. only monorepo + lightdash, skipping unrelated repos in the same parent dir).
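The four selection modes can be read as a short decision procedure. This sketch illustrates that logic under stated assumptions: the function name `select_repos` and the mode labels it returns are hypothetical, and the real script may order or name things differently.

```python
from pathlib import Path
from typing import List, Tuple


def select_repos(scan_root: Path, explicit: List[str]) -> Tuple[str, List[Path]]:
    """Return (mode, repos) following the selection table above."""
    if explicit:
        # Relative paths resolve against scan_root; joining an absolute
        # path with pathlib keeps the absolute path unchanged.
        return "explicit", [scan_root / p for p in explicit]
    if (scan_root / ".git").exists():
        return "single_repo", [scan_root]
    # Only immediate children count; nested repos are not searched.
    children = sorted(
        c for c in scan_root.iterdir() if c.is_dir() and (c / ".git").exists()
    )
    if children:
        return "workspace_auto", children
    return "single_non_git", [scan_root]
```

Checking `--repo` first mirrors the statement that explicit selection replaces auto-discovery entirely.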

What the script does

Stdlib-only by default, with two soft dependencies: PyYAML for richer frontmatter parsing (regex fallback otherwise), and jsonschema for IO contract validation against discovery.schema.json (no validation otherwise — single stderr warning).
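The soft-dependency pattern for frontmatter parsing might look like the sketch below. It is an illustration only: `parse_frontmatter` and both regular expressions are hypothetical, and the regex fallback here handles only flat `key: value` lines, which may be narrower than what the script supports.

```python
import re
from typing import Dict

try:
    import yaml  # PyYAML, optional soft dependency
except ImportError:
    yaml = None

# Leading block delimited by '---' lines at the very start of the file.
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
KEY_VALUE_RE = re.compile(r"^([A-Za-z_][\w-]*):\s*(.*)$")


def parse_frontmatter(text: str) -> Dict[str, object]:
    """Best-effort parse: PyYAML when available, flat regex otherwise."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        return {}
    block = match.group(1)
    if yaml is not None:
        try:
            data = yaml.safe_load(block)
            return data if isinstance(data, dict) else {}
        except yaml.YAMLError:
            pass  # a fuller version would record a warning here
    result: Dict[str, object] = {}
    for line in block.splitlines():
        kv = KEY_VALUE_RE.match(line)
        if kv:
            result[kv.group(1)] = kv.group(2).strip().strip("\"'")
    return result
```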

Per skill (deterministic, fast):

Glob discovery for SKILL.md files
Symlink-following walk with per-chain cycle detection (essential for Tessl-vendored installs that symlink .claude/skills/tessl__* → .tessl/tiles/...)
Content hashing (SHA-256 of full file bytes) → vendored dedup within a repo
YAML frontmatter parsing (best-effort, surfaces parse errors to warnings)
Supporting-file capture: references/ / reference/ dirs, scripts/ dirs, markdown links, @imports, inline backtick paths in body
Bundled-directory summarisation (other sibling dirs)
Source-type classification (origin-based: a vendored tile skill stays tessl_tile_skill even when surfaced via .claude/skills/)
Owning-package detection by walking up for tile.json / .claude-plugin/plugin.json
Tier stamping per skill: published_tile / authored_tile / github_tile / claude_plugin / non_tile
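Two of the steps above, the symlink-following walk and the content-hash dedup, can be sketched together. The function names are hypothetical and this is a simplified illustration under the assumption that "per-chain" means tracking the resolved directories along the current descent only, so the same directory reached through two different symlinks is still visited twice while true loops terminate.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, FrozenSet, Iterator, List


def find_skill_files(root: Path, chain: FrozenSet[str] = frozenset()) -> Iterator[Path]:
    """Yield SKILL.md paths under root, following directory symlinks."""
    real = os.path.realpath(root)
    if real in chain:
        return  # this descent has looped back onto one of its ancestors
    chain = chain | {real}
    try:
        with os.scandir(root) as it:
            entries = sorted(it, key=lambda e: e.name)
    except OSError:
        return  # unreadable directory: skip rather than abort the scan
    for entry in entries:
        if entry.is_dir(follow_symlinks=True):
            yield from find_skill_files(Path(entry.path), chain)
        elif entry.name == "SKILL.md" and entry.is_file(follow_symlinks=True):
            yield Path(entry.path)


def group_by_content(paths: List[Path]) -> Dict[str, List[Path]]:
    """Group paths by the SHA-256 of their full bytes (vendored copies collide)."""
    groups: Dict[str, List[Path]] = {}
    for path in paths:
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups.setdefault(digest, []).append(path)
    return groups
```

Each group with more than one path then represents one logical skill surfaced at several locations.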

.tessl/ exclusion: SKILL.md files inside .tessl/ (the Tessl CLI's installed-tile cache) are not included in skills[]. Those tiles are surfaced via tiles[] instead, with source: "tessl_json". This keeps the skill inventory focused on first-party content the user authored.

Per tile (Tessl-aware enrichment, requires tessl CLI + auth):

Discovery source: every tile entry carries a source field of "tessl_json" or "filesystem". Tiles found via tessl.json declarations are tagged tessl_json; tiles discovered by walking the SKILL.md tree (e.g. authored tiles under tiles/<name>/) are tagged filesystem. When a tile is found via both methods, tessl_json wins.
Tier classification by inspecting tessl.json source field and tile.json location
Registry call to GET /v1/tiles/{ws}/{name}/versions/{ver} for every tile that resolves on the registry — pulls aggregate / quality / impact / security / uplift / eval scores / moderation / archived / fingerprint
One tessl outdated --json call per scanned repo, run from that repo, mapped per tile to detect "newer version available on registry"
One tessl tile lint per tile parsed into front-loaded + on-demand token totals (and per-skill breakdown)

If the Tessl auth file is missing (~/.tessl/api-credentials.json), registry/outdated enrichment is silently skipped — the scan still produces a valid inventory, just without registry-derived signals. tessl tile lint still runs (it's local).
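The discovery-source precedence rule ("tessl_json wins") is easy to express as a merge step. In this sketch the `name` key used to identify a tile is an assumption; only the `source` field and its two values come from the text.

```python
from typing import Dict, Iterable, List

# Lower number wins when the same tile is found by both methods.
SOURCE_PRIORITY = {"tessl_json": 0, "filesystem": 1}


def merge_tiles(tiles: Iterable[dict]) -> List[dict]:
    """Keep one entry per tile, preferring the tessl_json declaration."""
    best: Dict[str, dict] = {}
    for tile in tiles:
        key = tile["name"]  # hypothetical identifier field
        current = best.get(key)
        if current is None or (
            SOURCE_PRIORITY[tile["source"]] < SOURCE_PRIORITY[current["source"]]
        ):
            best[key] = tile
    return list(best.values())
```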

Broken-reference detection (git-history backed):

For every path-shaped reference in a SKILL.md body — markdown links, @imports, inline backticks — the script:

Resolves the target to a repo-relative path
Checks git ls-tree HEAD (currently tracked)
Checks git log --all --name-only (ever tracked)
If currently tracked → records as a supporting_files[] entry
If ever tracked but no longer present → emits a broken link in <repo>/<skill_path>: <target> warning
If never tracked → silently ignored (it's a code symbol, package name, or external reference, not a repo file)

No extension allowlist. No false positives from prose mentioning paths that aren't from this repo.
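The three-way decision above reduces to two set lookups. The sketch below is illustrative: the function names and return labels are hypothetical, and the exact git invocations the script uses are not documented beyond the command names, so the flags shown in the usage note are an assumption.

```python
import subprocess
from typing import Set


def classify_reference(target: str, tracked_now: Set[str], tracked_ever: Set[str]) -> str:
    """Classify a repo-relative path per the rules above."""
    if target in tracked_now:
        return "supporting_file"  # recorded in supporting_files[]
    if target in tracked_ever:
        return "broken"  # existed once, gone now: emit a warning
    return "ignored"  # never a repo file: symbol, package name, external ref


def git_paths(repo: str, *args: str) -> Set[str]:
    """Run a git command in repo and return its non-empty output lines."""
    out = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}
```

One way to build the two sets is `git_paths(repo, "ls-tree", "-r", "--name-only", "HEAD")` for currently tracked files and `git_paths(repo, "log", "--all", "--name-only", "--pretty=format:")` for files that were ever tracked. Computing both once per repo keeps the per-reference check cheap.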

Verify + summarise

After the script returns:

Check exit code: 0 means success. Non-zero means the scan failed — surface stderr to the user.
The script self-validates its output against discovery.schema.json before writing (when jsonschema is installed). As an external sanity check, the schema version can still be verified with jq:
```
jq -e '.schema_version == "1.4"' "$OUTPUT_PATH" > /dev/null
```

Print a concise summary from the JSON:

Skill Discovery complete (schema 1.4).
Scan root:    <scan_root>
Repos:        <N>
Skills:       <N> logical (<M> paths)
Tiles:        <N> total — <published_to_registry> in registry, <authored_only> authored only
By source:    tessl_json=<N>, filesystem=<N>
By tier:      published=<N>, authored=<N>, github=<N>
Security:     <N> tiles flagged MEDIUM/HIGH/CRITICAL
Updates:      <N> tiles outdated vs registry
Broken refs:  <N> warnings
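Part of this summary can be derived directly from the inventory. The sketch below covers only the lines whose inputs are named in this document (`schema_version`, `tiles[]`, and each tile's `source`); the function name is hypothetical, and the remaining lines depend on field names that should be taken from discovery.schema.json rather than guessed.

```python
import json
from collections import Counter
from pathlib import Path


def summarise(inventory: dict) -> str:
    """Render the schema, tile-count, and by-source summary lines."""
    tiles = inventory.get("tiles", [])
    by_source = Counter(t.get("source") for t in tiles)
    lines = [
        f"Skill Discovery complete (schema {inventory.get('schema_version')}).",
        f"Tiles:        {len(tiles)} total",
        f"By source:    tessl_json={by_source['tessl_json']}, "
        f"filesystem={by_source['filesystem']}",
    ]
    return "\n".join(lines)


def summarise_file(output_path: str) -> str:
    """Load the inventory written by the script and summarise it."""
    return summarise(json.loads(Path(output_path).read_text(encoding="utf-8")))
```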

Requirements

| Requirement | Why | What happens if missing |
| --- | --- | --- |
| Python 3 | Run the script | Script can't run; fix Python first |
| `git` on PATH | Broken-ref detection, mtime/age tracking | No broken-ref signal; ages still work via filesystem if `.git/` is present |
| `tessl` CLI on PATH | Outdated check + tile-lint context cost | Those enrichment fields will be missing on the tiles |
| `~/.tessl/api-credentials.json` (auth) | Registry enrichment per tile | `published_to_registry: null`, no `registry` block, no security/uplift/quality signals from the registry; falls back to local-only |
| `jsonschema` Python package | Output contract validation against `discovery.schema.json` | Single stderr warning; output written without validation (stdlib-only fallback) |

The script degrades gracefully — every step is best-effort and missing capabilities surface as null in the output rather than aborting the scan.
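Before a scan, the optional requirements can be probed up front so that missing signals are expected rather than surprising. This is a minimal sketch with a hypothetical function name and result keys; it checks only for presence, not for a working or authenticated setup.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Dict


def probe_capabilities(home: Path) -> Dict[str, bool]:
    """Report which optional requirements from the table are available."""
    return {
        "git": shutil.which("git") is not None,
        "tessl_cli": shutil.which("tessl") is not None,
        "tessl_auth": (home / ".tessl" / "api-credentials.json").is_file(),
        "jsonschema": importlib.util.find_spec("jsonschema") is not None,
        "pyyaml": importlib.util.find_spec("yaml") is not None,
    }
```

Calling `probe_capabilities(Path.home())` and printing the false entries gives a quick preview of which enrichment fields are likely to be null in the output.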