Scan a directory or workspace for SKILL.md files across all agents and repos, capture supporting files (references, scripts, linked docs), dedupe vendored copies, enrich each Tessl tile with registry signals, and emit a canonical JSON inventory validated by JSON Schema. Then run four analytical phases in parallel against the inventory — staleness + git provenance (history, broken refs, contributors), quality (Tessl `skill review`), duplicates (similarity + LLM judgement), registry-search (per-standalone-skill registry suggestions, HTTP only) — and render a self-contained interactive HTML report with a top-of-report health overview, top-issues panel, recently-changed list, and per-tessl.json manifests view.
Compute deterministic staleness signals + git provenance (authorship) for every skill in a discovery.json. Output conforms to staleness.schema.json (currently schema_version: "1.1").
Fully programmatic — no LLM, no agent judgement. Single bundled script. The script validates its input (discovery.json) against discovery.schema.json and its output against staleness.schema.json at the IO boundary; a malformed input or output exits with code 2 rather than silently corrupting the run.
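The IO-boundary contract described above might look like the following sketch (an illustration, not the bundled script; the helper name and messages are assumptions):

```python
import sys

try:
    import jsonschema  # soft dependency: validation is skipped if absent
except ImportError:
    jsonschema = None

def validate_or_exit(instance, schema, label):
    """Validate a JSON document against its schema at the IO boundary.

    Exits with code 2 on a contract violation; warns and continues if
    jsonschema is not installed.
    """
    if jsonschema is None:
        print(f"warning: jsonschema not installed, skipping {label} validation",
              file=sys.stderr)
        return
    try:
        jsonschema.validate(instance=instance, schema=schema)
    except jsonschema.ValidationError as err:
        print(f"error: {label} does not conform to schema: {err.message}",
              file=sys.stderr)
        sys.exit(2)
```

Exiting with a distinct code at the boundary means a malformed `discovery.json` fails the phase immediately rather than producing a plausible-looking but wrong `staleness.json`.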
- `--discovery <path>` (required) — discovery.json produced by discover-skills (schema 1.3).
- `--output <path>` — defaults to `<dirname(discovery)>/staleness.json`.

The script reads each repo's path from `discovery.metadata.repos[].path` and runs `git log` against the skill's `all_paths`, falling back through paths in priority order so vendored gitignored copies don't return empty histories.
```shell
python3 <skill-dir>/scripts/analyze_staleness.py \
  --discovery "$DISCOVERY_PATH" \
  --output "$OUTPUT_PATH"
```

Stdlib + git on PATH; `jsonschema` is a soft dependency used for IO contract validation (skipped with a warning otherwise). Runtime is dominated by `git log` calls — typically <1s per 100 skills (one `git log` per skill, all signals derived from a single stream).
All signals are derived from a single `git log --format=%H%x09%aI%x09%an%x09%ae%x09%s -- <path>` invocation per skill (with path-priority fallback): one subprocess per skill, everything else computed in-process from the parsed stream.
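Parsing that tab-separated stream is straightforward; a minimal sketch (the row type and helper name are assumptions, not the script's actual internals):

```python
from typing import NamedTuple

class CommitRow(NamedTuple):
    sha: str
    date: str     # %aI: ISO 8601 author date
    name: str     # %an
    email: str    # %ae
    subject: str  # %s

def parse_commit_rows(stream: str) -> list[CommitRow]:
    """Parse `git log --format=%H%x09%aI%x09%an%x09%ae%x09%s` output.

    %x09 is a literal tab, so each row splits cleanly on \t; maxsplit=4
    keeps any tabs inside the commit subject intact.
    """
    rows = []
    for line in stream.splitlines():
        if not line:
            continue
        sha, date, name, email, subject = line.split("\t", 4)
        rows.append(CommitRow(sha, date, name, email, subject))
    return rows
```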
| Signal | How |
|---|---|
| `last_modified`, `first_seen`, `commit_count` | First / last / count of commit rows in the parsed git log stream |
| `tracked_path` | The actual git-tracked path that produced the history (may differ from `primary_path` for vendored skills) |
| `days_since_modified`, `days_since_first_seen` | Derived from `last_modified` / `first_seen` |
| `git_provenance.created_by` | Author (name + email) of the first commit row |
| `git_provenance.last_modified_by` | Author (name + email) of the most recent commit row |
| `git_provenance.contributors[]` | Distinct (name, email) pairs aggregated from commit rows, sorted by commit count, top 10 |
| `git_provenance.recent_commits[]` | Most recent 5 commits (sha, ISO date, author, subject) |
| `broken_references` | Pulled from discovery.warnings (`broken link in <repo>/<path>: <target>`). Discovery's git-history-backed detection covers markdown links, @imports, and inline backticks |
| `tile_update_available` | True if `discovery.tiles[].outdated.update_available` for the owning materialised tile instance |
| `tile_current_version`, `tile_latest_version`, `tile_last_scored_at` | Pulled from `discovery.tiles[]` enrichment |
| `tier` | Stamped from discovery |
| `staleness_score` (0-100) | See scoring below |
| `staleness_bucket` | fresh / warm / stale / ancient / unknown |
| `factors[]` | Qualitative tags explaining the score (e.g. `older_than_180_days`, `broken_references`, `registry_update_available`) |
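The provenance signals in the table can be derived from the parsed rows in one pass; a sketch, assuming rows are `(sha, date, name, email, subject)` tuples newest-first as `git log` emits them (the function name is illustrative):

```python
from collections import Counter

def derive_provenance(rows):
    """Derive the git_provenance fields from parsed commit rows."""
    if not rows:
        return None
    by_author = Counter((name, email) for _, _, name, email, _ in rows)
    return {
        # last row in the stream is the oldest commit
        "created_by": {"name": rows[-1][2], "email": rows[-1][3]},
        "last_modified_by": {"name": rows[0][2], "email": rows[0][3]},
        # distinct (name, email) pairs, sorted by commit count, top 10
        "contributors": [
            {"name": n, "email": e, "commits": c}
            for (n, e), c in by_author.most_common(10)
        ],
        # most recent 5 commits
        "recent_commits": [
            {"sha": sha, "date": date, "author": name, "subject": subject}
            for sha, date, name, _, subject in rows[:5]
        ],
    }
```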
```
score = 0
  + 5  if days_since_modified > 30
  + 15 if days_since_modified > 90
  + 25 if days_since_modified > 180
  + 25 if days_since_modified > 365
  + 10 per broken reference (capped at +30)
  + 5  if days_since_modified > 90 AND repo median > 180
  + 10 if commit_count == 1 (never updated since first commit)
  + 15 if owning tile has a registry update available
  + 20 if no git history at all
clamp to [0, 100]
```

Buckets:
- fresh: days_since_modified < 30 and score < 20
- warm: days_since_modified < 90 and score < 40
- stale: 90 ≤ days_since_modified ≤ 365 or 40 ≤ score ≤ 70
- ancient: days_since_modified > 365 or score >= 70
- unknown: no git history

Sanity checks:

```shell
jq -e '.schema_version == "1.1"' "$OUTPUT_PATH" > /dev/null
```

`per_skill[].length` should match `discovery.skills.length`.
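The scoring and bucket rules read as the following sketch (an illustration, not the bundled script; where the stale and ancient conditions overlap, ancient is assumed to take precedence):

```python
def staleness_score(days, broken_refs, commit_count,
                    repo_median_days, tile_update_available):
    """Additive staleness score; days is None when there is no git history."""
    score = 0
    if days is None:
        score += 20                       # no git history at all
    else:
        if days > 30:  score += 5
        if days > 90:  score += 15
        if days > 180: score += 25
        if days > 365: score += 25
        if days > 90 and repo_median_days and repo_median_days > 180:
            score += 5                    # well behind the repo median
    score += min(10 * broken_refs, 30)    # +10 per broken ref, capped at +30
    if commit_count == 1:
        score += 10                       # never updated since first commit
    if tile_update_available:
        score += 15                       # registry update available
    return max(0, min(100, score))        # clamp to [0, 100]

def staleness_bucket(days, score):
    if days is None:
        return "unknown"
    if days < 30 and score < 20:
        return "fresh"
    if days < 90 and score < 40:
        return "warm"
    if days > 365 or score >= 70:
        return "ancient"
    return "stale"
```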
```
Staleness analysis complete.
Skills: <N>
Median age: <D> days
Broken refs: <N> skills affected
Buckets: fresh=<N>, warm=<N>, stale=<N>, ancient=<N>, unknown=<N>
Top offender: <skill_id> (score <S>)
Output: <path>
```

The script degrades gracefully; any of the following are non-fatal:
- `last_modified` etc. are null, `no_git_history` factor added, +20 score penalty
- `tile_update_available` stays false, no registry-update factor
- `tiles[]` present but the registry call wasn't made (no auth) → same; `tile_update_available` is false

The phase needs only discovery.json:
```shell
python3 <skill-dir>/scripts/analyze_staleness.py \
  --discovery /path/to/.skill-insights/discovery.json \
  --output /tmp/staleness.json
```

No other phases need to have run.