tessleng/skill-insights

Scan a directory or workspace for SKILL.md files across all agents and repos, capture supporting files (references, scripts, linked docs), dedupe vendored copies, enrich each Tessl tile with registry signals, and emit a canonical JSON inventory validated by JSON Schema. Then run four analytical phases in parallel against the inventory — staleness + git provenance (history, broken refs, contributors), quality (Tessl `skill review`), duplicates (similarity + LLM judgement), registry-search (per-standalone-skill registry suggestions, HTTP only) — and render a self-contained interactive HTML report with a top-of-report health overview, top-issues panel, recently-changed list, and per-tessl.json manifests view.

name: posthog-skill-query
description: Fetch org-wide skill activations, install footprint, MCP-tool calls, and session aggregates from PostHog (the cli:agent-signals event stream emitted by the Tessl CLI) and emit a self-contained org_usage.json plus an interactive HTML report. Standalone: does not read discovery.json or any other phase output, and is not coupled to any particular repo. The output is raw counts only; no buckets, scores, or judgements are encoded in the JSON. Use when asked to pull org-wide skill usage from PostHog, audit cross-team skill adoption, see what skills users have loaded vs activated, look at tessl-MCP tool usage, find skills people are using that this repo doesn't have installed, render an interactive PostHog usage report, or produce the org_usage.json input for the cross-reference step of the skill-insights pipeline.

PostHog Skill Query

Name: tessleng/skill-insights
Rating: 84.71 (1 review)
Author: tessleng

Pulls org-wide skill / MCP / session telemetry into one self-contained org_usage.json (schema org-usage.schema.json, currently 1.4), and renders an interactive HTML report from it.

Standalone phase. Reads only from PostHog. Writes one JSON and one HTML. Has no awareness of discovery.json or any other phase's output, and produces only raw counts — no derived metrics, no shelf_warmer/active/silent buckets, no conversion ratios, no warnings list of judgement calls. Cross-reference and value-judgements are someone else's job downstream.

What "org" means here

PostHog project 57574 receives cli:agent-signals:* events from every Tessl CLI install with analytics on — Tessl employees, paying customers, OSS community users, anonymous CI runs. To slice that down to a meaningful "org", the script supports two independent filters that combine with OR:

Repo-prefix filter (--filter-repos, default github.com/tesslio): matches properties.gitRepo (and properties.sessionGitRepo for session-processed events). Each prefix matches either exactly or as a path prefix, i.e. the repo equals the prefix or starts with <prefix>/.
Email-domain filter (--filter-email-domains, default tessl.io): matches person.properties.email as %@<domain> via the implicit persons join.

An event passes the filter if either matches. So the default — github.com/tesslio OR @tessl.io — captures both:

Tessl employees working in the canonical Tessl repos (caught by repo).
Tessl employees working in personal accounts, customer repos, or sessions outside any git checkout (caught by email).

To disable either half, pass an empty string: --filter-repos "" or --filter-email-domains "". Disabling both pulls every event in the project.
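
The OR semantics above can be sketched in a few lines of Python. This is an illustration of the matching rules as described, not the script's actual code (the real filter runs inside the HogQL query); the function names are hypothetical, and for session-processed events the repo value would come from properties.sessionGitRepo.

```python
from typing import Optional, Sequence


def repo_matches(git_repo: Optional[str], prefixes: Sequence[str]) -> bool:
    """True when git_repo equals a prefix or sits under '<prefix>/'."""
    if not git_repo:
        return False
    return any(git_repo == p or git_repo.startswith(p + "/") for p in prefixes)


def email_matches(email: Optional[str], domains: Sequence[str]) -> bool:
    """True when the email ends in '@<domain>' for any allowed domain."""
    if not email:
        return False
    return any(email.endswith("@" + d) for d in domains)


def event_passes(
    git_repo: Optional[str],
    email: Optional[str],
    prefixes: Sequence[str] = ("github.com/tesslio",),
    domains: Sequence[str] = ("tessl.io",),
) -> bool:
    # Both halves disabled means no filtering: every event passes.
    if not prefixes and not domains:
        return True
    # Otherwise an event passes if either half matches.
    return repo_matches(git_repo, prefixes) or email_matches(email, domains)
```

Note that the path-prefix rule keeps a look-alike such as github.com/tesslio-fork/x out, because it neither equals the prefix nor starts with github.com/tesslio/.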

The filter applies to every query: activations, tile rollup, per-skill detail, untiled, loaded skills, MCP tools, and session aggregates. Around 30% of skill-activation events have no gitRepo set at all — those still pass via the email filter for authenticated Tessl employees. Anonymous unauthenticated events (~64% of the project's traffic) match neither filter and are excluded. The JSON's filter.events_per_window block reports per-window matched/excluded counts and a per-source split (events_matched_by_repo, events_matched_by_email) so coverage is visible.
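
To read coverage out of filter.events_per_window, a small helper along these lines may be useful. It is a sketch that assumes events_matched_filter is the union of the two per-source counts (so the overlap is by_repo + by_email - matched); the helper name is hypothetical, and the field names are the ones shown in the Output shape section.

```python
def coverage(window: dict) -> dict:
    """Summarise one entry of filter.events_per_window (e.g. the "30d" block)."""
    total = window["events_total"]
    matched = window["events_matched_filter"]
    by_repo = window["events_matched_by_repo"]
    by_email = window["events_matched_by_email"]
    return {
        # Share of all project events that passed the filter.
        "matched_share": matched / total if total else 0.0,
        # Share of events with no gitRepo at all.
        "no_gitrepo_share": window["events_no_gitrepo"] / total if total else 0.0,
        # Events caught by both halves (assumes matched is the union).
        "matched_by_both": by_repo + by_email - matched,
        # Matched plus excluded should add back up to the total.
        "accounted_for": matched + window["events_excluded_by_filter"] == total,
    }
```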

What it queries

Three PostHog events, all emitted by the Tessl CLI's agent-signals sync loop:

Event	What it carries
`cli:agent-signals:skill-activation`	One event per skill activation, with `skillTile`, `skillName`, `provider`, `sessionId`, `signalTimestamp`, plus an `installedSkills[]` snapshot of the user's whole catalogue at activation time. Tile is null for non-Tessl skills (third-party / personal SKILL.md).
`cli:agent-signals:mcp-tool-activation`	One event per `mcp__tessl__*` invocation, with `tool` (prefix-stripped) and `provider`.
`cli:agent-signals:session-processed`	Per-session aggregates: `totalMessages`, `totalSkillCalls`, `tesslSkillCalls`, `tesslMcpCalls`, `tesslToolCalls`, etc.

For each configured day window (default 7,30,90) the skill issues:

Top-line totals — activations, users, sessions, tiled vs untiled split, providers.
Tile rollup — per-skillTile counts.
Per-skill detail — for the top N tiles (default 50) by primary-window activations. Includes per-provider activation counts and first/last-seen timestamps.
Untiled rollup — top N (default 100) skills with no skillTile.
Loaded skills — unrolled installedSkills[] arrays, counted by distinct user. Tells you who has each skill available, not who fires it.
MCP tools — every mcp__tessl__* tool with activation counts.
Session aggregates — sessions, users, summed message and tool counts.
Per-repo views (1.4+): repos[] (per-gitRepo totals) plus tiles_by_repo[], skills_by_repo[], untiled_skills_by_repo[], mcp_tools_by_repo[], and session_aggregates_by_repo mirror their non-repo counterparts with a gitRepo discriminator. Capped at top N repos (default 200) by primary-window activations. Powers the report's Repos section and the chip filter that lets the reader exclude repos client-side without re-querying PostHog.
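
For orientation, a tile rollup for one window could be expressed as a HogQL query roughly like the one built here. This is a guess at the general shape, not the query the script actually sends: the use of person_id for distinct users, the column aliases, and the function name are all assumptions, and the org filter is omitted.

```python
def tile_rollup_query(days: int) -> str:
    """Build an illustrative HogQL tile rollup for the last `days` days."""
    # Only accept a positive integer so nothing untrusted reaches the query text.
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return f"""
        SELECT properties.skillTile AS tile,
               count() AS activations,
               count(DISTINCT person_id) AS users,
               count(DISTINCT properties.sessionId) AS sessions
        FROM events
        WHERE event = 'cli:agent-signals:skill-activation'
          AND timestamp >= now() - INTERVAL {days} DAY
          AND properties.skillTile IS NOT NULL
        GROUP BY tile
        ORDER BY activations DESC
    """
```

One such query per configured window (7, 30, 90 by default) would produce the per-window counts that end up under each tile's windows key.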

Inputs (fetch_org_usage.py)

Flag	Default	Purpose
`--output`	required	Path to write `org_usage.json`
`--posthog-host`	`https://us.posthog.com`	PostHog host
`--posthog-project`	`57574`	Project ID — Tessl's prod project
`--posthog-key-file`	`~/.tessl/posthog/personal-api-key`	Personal API key file. `$POSTHOG_PERSONAL_API_KEY` takes precedence.
`--windows`	`7,30,90`	Comma-separated day windows
`--primary-window`	`30`	Window used for sort + provider breakdown
`--top-tiles-detail`	`50`	Per-skill detail fetched only for top N tiles
`--top-untiled`	`100`	Number of untiled skills to include
`--top-loaded`	`5000`	Max loaded-skill rows to fetch per window
`--top-repos`	`200`	Cap on distinct repos retained in `repos[]` and `*_by_repo[]` arrays. Long-tail repos beyond the cap are dropped from per-repo views; their events still feed the all-repos `totals`.
`--filter-repos`	`github.com/tesslio`	Comma-separated repo-prefix allowlist (matched on `properties.gitRepo`). Empty string disables this half.
`--filter-email-domains`	`tessl.io`	Comma-separated email-domain allowlist (matched on `person.properties.email` as `%@<domain>`). Leading `@` optional. Empty string disables this half.
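
The comma-separated flags follow a few small conventions (empty string disables a filter half, leading @ is optional on domains). A minimal parsing sketch, with hypothetical helper names and no claim to match the script's own argument handling:

```python
from typing import List


def parse_csv(raw: str) -> List[str]:
    """Split a comma-separated flag value; an empty string yields an empty list."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def parse_windows(raw: str) -> List[int]:
    """Parse '--windows 7,30,90' into [7, 30, 90], rejecting non-positive values."""
    days = [int(part) for part in parse_csv(raw)]
    if not days or any(d <= 0 for d in days):
        raise ValueError("--windows needs one or more positive day counts")
    return days


def parse_domains(raw: str) -> List[str]:
    """Parse '--filter-email-domains', accepting 'tessl.io' or '@tessl.io'."""
    return [d.lstrip("@") for d in parse_csv(raw)]
```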

The personal API key needs project:read and query:read scopes. Get one at https://us.posthog.com/settings/user-api-keys.
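
The key lookup order (environment variable first, then the key file) might look like the following. It is a sketch: the function name and the injectable env parameter are illustrative choices, not part of the script's interface.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(key_file: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the API key, preferring $POSTHOG_PERSONAL_API_KEY over the key file."""
    key = env.get("POSTHOG_PERSONAL_API_KEY", "").strip()
    if key:
        return key
    try:
        # expanduser() handles the default '~/.tessl/posthog/personal-api-key'.
        return Path(key_file).expanduser().read_text(encoding="utf-8").strip() or None
    except OSError:
        # Missing or unreadable file: the caller decides how to report it.
        return None
```

A None result corresponds to the "No API key" failure mode, where the script exits with a pointer to the PostHog settings page.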

Run (fetch)

python3 <skill-dir>/scripts/fetch_org_usage.py \
  --output "$OUTPUT_DIR/org_usage.json" \
  --windows 7,30,90

Stdlib only (urllib, json); jsonschema is a soft dependency used for IO-contract validation when available. Wall time is dominated by PostHog query latency — typically 25–35 seconds end-to-end for the default three windows.
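
A soft dependency of this kind is usually handled with a guarded import. The sketch below assumes the schema has already been loaded into a dict and uses exit code 2 on validation failure, as described under Failure modes; the function name and return values are hypothetical.

```python
import sys


def validate_if_possible(document: dict, schema: dict) -> str:
    """Validate against the schema when jsonschema is installed, else skip."""
    try:
        import jsonschema  # optional third-party package
    except ImportError:
        return "skipped"
    try:
        jsonschema.validate(instance=document, schema=schema)
    except jsonschema.ValidationError as exc:
        # Stop before a malformed document can reach downstream consumers.
        print(f"schema validation failed: {exc.message}", file=sys.stderr)
        raise SystemExit(2)
    return "validated"
```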

Render (render_org_usage.py)

python3 <skill-dir>/scripts/render_org_usage.py \
  --input  "$OUTPUT_DIR/org_usage.json" \
  --output "$OUTPUT_DIR/org_usage.html"

The renderer is a single string substitution into org-usage-report-template.html. The template carries all the styling, sorting, filtering, and footer-caveat copy; the script is a thin wrapper around one template.replace(placeholder, json) call, where placeholder is the marker string the template reserves for the embedded JSON.

The rendered report is fully self-contained — no external JS, the JSON is embedded in a <script type="application/json"> block, all interactivity is plain DOM.
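
As a rough picture of that substitution, consider the sketch below. The placeholder token __ORG_USAGE_JSON__ is hypothetical (the real template's marker is not shown here), and escaping "</" is a precaution added in this sketch so that a string value containing </script> cannot close the embedding block early.

```python
import json

# Hypothetical marker; the real template defines its own placeholder.
PLACEHOLDER = "__ORG_USAGE_JSON__"


def render(template: str, data: dict) -> str:
    """Embed the org-usage JSON into the HTML template by string substitution."""
    if PLACEHOLDER not in template:
        raise SystemExit(f"placeholder {PLACEHOLDER!r} not found in template")
    # '\/' is a valid JSON escape for '/', so the payload still parses
    # unchanged while no literal '</script>' appears inside it.
    payload = json.dumps(data).replace("</", "<\\/")
    return template.replace(PLACEHOLDER, payload)
```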

Sync queries with backoff retry

The fetch script issues plain synchronous HogQL queries against /api/projects/<id>/query/. ClickHouse caps each query at 10s server-side, which our queries are tuned to stay under (per-skill detail is filtered to the top N tiles, untiled rollup is LIMIT 200, etc.).

Async polling was tried but ran into production overload ("Queries are a little too busy right now") because the async pool queues project-wide. Sync skips that pool entirely.

On 502/503/504 the script retries with backoff (5s, 15s) before giving up.
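
The retry policy can be sketched as a small wrapper around any callable that performs the HTTP request. The wrapper name and the injectable sleep parameter are illustrative; only the status codes and the 5s/15s delays come from the description above.

```python
import time
import urllib.error

RETRYABLE = {502, 503, 504}


def with_backoff(call, delays=(5, 15), sleep=time.sleep):
    """Run call(); on 502/503/504 wait 5s, then 15s, then give up."""
    # One attempt per delay, plus a final attempt with no retry left (None).
    for delay in (*delays, None):
        try:
            return call()
        except urllib.error.HTTPError as exc:
            if exc.code not in RETRYABLE or delay is None:
                raise  # non-retryable status, or retries exhausted
            sleep(delay)
```

With the default delays this makes at most three attempts, and any other HTTP error (a 403, say) propagates immediately.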

Auth check

Before issuing analytical queries, the script makes one GET /api/projects/<id>/ call. If it returns 403 (most likely cause: missing scope), it exits with a clear error pointing at the settings page.
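
A preflight check of this kind could look as follows with urllib. This is a sketch rather than the script's code: the function name, the error wording, and the injectable urlopen parameter are assumptions, while the endpoint, the Bearer header for personal API keys, and the settings URL are as described in this document and PostHog's API conventions.

```python
import urllib.error
import urllib.request

SETTINGS_URL = "https://us.posthog.com/settings/user-api-keys"


def auth_check(host: str, project_id: int, api_key: str,
               urlopen=urllib.request.urlopen) -> int:
    """GET /api/projects/<id>/ once; exit with a clear message on 403."""
    req = urllib.request.Request(
        f"{host.rstrip('/')}/api/projects/{project_id}/",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with urlopen(req, timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        if exc.code == 403:
            # Include PostHog's response body so a missing scope is obvious.
            body = exc.read().decode("utf-8", "replace")
            raise SystemExit(
                f"PostHog returned 403 (check key scopes at {SETTINGS_URL}): {body}"
            )
        raise
```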

What "loaded" vs "activated" mean

The two are independent signals. Activated means a cli:agent-signals:skill-activation event fired for that (tile, name) — i.e. one of these specific paths produced a Skill tool event:

Modern Claude Code Skill tool invocation (canonical slash command + agent-organic both produce this).
Cursor IDE user-typed /foo slash command.
Cursor IDE agent calling read_file* against a .claude/skills/<name>/SKILL.md, .cursor/skills/<name>/SKILL.md, or .tessl/tiles/<ws>/<tile>/skills/<name>/SKILL.md path.

Loaded means the skill appeared in installedSkills[] on at least one activation event in the window — i.e. somebody had it in their catalogue at the time something fired.

The two have known gaps and you should treat them as facts about telemetry events, not as proxies for "skill is used / unused":

A skill that was used heavily through a path PostHog doesn't watch (e.g. Cursor @SKILL.md mention, raw Claude Code Read of SKILL.md, older slash-command flow, non-claude/non-cursor harnesses) won't appear in activations.
A user who has skills installed but never fires any activation event in the window contributes zero rows to loaded_skills — we only see catalogues for users who fire at least one event.

The HTML report's footer spells these caveats out for the reader. They are not encoded in the JSON.
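
The difference between the two signals is easy to see on a handful of in-memory events. In this sketch the event dicts are simplified and hypothetical (a user key stands in for the PostHog person, and installedSkills entries are plain names rather than the real snapshot objects).

```python
from collections import Counter, defaultdict


def summarise(events):
    """Return (activations per (tile, name), distinct loading users per skill)."""
    activations = Counter()
    loaded_users = defaultdict(set)
    for ev in events:
        # Activated: one count per skill-activation event.
        activations[(ev["skillTile"], ev["skillName"])] += 1
        # Loaded: every skill in the snapshot, counted once per distinct user.
        for skill in ev.get("installedSkills", []):
            loaded_users[skill].add(ev["user"])
    return activations, {skill: len(users) for skill, users in loaded_users.items()}
```

A skill can therefore show several loading users and zero activations, and a user who never fires an activation event contributes nothing to either result.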

Output shape (1.4)

Roughly (full schema in org-usage.schema.json):

{
  "schema_version": "1.4",
  "fetched_at": "2026-05-05T09:31:50Z",
  "source": { "kind": "posthog", "host": "...", "project_id": 57574,
              "dashboard_id_for_reference": 1358856 },
  "windows": [7, 30, 90],
  "primary_window_days": 30,
  "top_tiles_detail": 50,
  "top_untiled": 100,
  "top_loaded": 5000,
  "top_repos": 200,
  "tool_version": "skill-insights@0.11.0",
  "filter": {
    "repos": ["github.com/tesslio"],
    "email_domains": ["tessl.io"],
    "match_kind": "prefix-or-email-domain",
    "events_per_window": {
      "30d": {
        "events_total": 12634, "events_with_gitrepo": 9032,
        "events_no_gitrepo": 3602,
        "events_matched_filter": 1027,
        "events_matched_by_repo": 476,
        "events_matched_by_email": 1023,
        "events_excluded_by_filter": 11607
      }
    }
  },

  // All-repos rollups (unchanged from 1.3 — these are what the report shows
  // when no repo chip is unticked).
  "totals":   { "30d": { "activations": ..., "users": ..., "sessions": ...,
                          "tiled_activations": ..., "untiled_activations": ...,
                          "providers": { "claude-code": {...}, "cursor-ide": {...} } },
                "7d": {...}, "90d": {...} },
  "tiles":    [ { "tile": "tessleng/backend-prod-query",
                  "windows": { "30d": {"activations":13,"users":7,"sessions":8}, ... } } ],
  "skills":   [ { "tile": "...", "name": "...",
                  "windows": { "30d": {...}, "7d": {...}, "90d": {...} },
                  "providers": { "claude-code": 13, "cursor-ide": 0 },
                  "first_seen": "...", "last_seen": "..." } ],
  "untiled_skills": [ { "name": "first-principles-dialogue",
                        "windows": { "30d": {"activations":251,"users":1}, ... } } ],
  "loaded_skills":  [ { "tile": "tessleng/agent-insight-experiment", "name": "synthesize-insights",
                        "scope": "project",
                        "windows": { "30d": {"users":18}, ... } } ],
  "mcp_tools": [ { "tool": "query_library_docs",
                   "windows": { "30d": {"activations":580,"users":89,"sessions":...} } } ],
  "session_aggregates": { "30d": { "sessions": ..., "messages": ...,
                                   "tessl_skill_calls": ..., "tessl_mcp_calls": ...,
                                   "tessl_cli_calls": ... }, ... },

  // NEW in 1.4 — per-`gitRepo` views.
  // Events with a NULL gitRepo are dropped from these (their volume is in
  // filter.events_per_window.*.events_no_gitrepo). The renderer uses these
  // to power the chip filter and a Repos section, and re-aggregates the
  // all-repos rollups by SUBTRACTING the excluded repos' contributions.
  "repos": [ { "repo": "github.com/tesslio/monorepo",
               "windows": { "30d": { "activations": 263, "users": 18, "sessions": 149,
                                     "tiled_activations": 200, "untiled_activations": 63,
                                     "providers": { "claude-code": {...}, "cursor-ide": {...} } } } } ],
  "tiles_by_repo":  [ { "tile": "...", "repo": "...", "windows": {...} } ],
  "skills_by_repo": [ { "tile": "...", "name": "...", "repo": "...", "windows": {...} } ],
  "untiled_skills_by_repo": [ { "name": "...", "repo": "...", "windows": {...} } ],
  "mcp_tools_by_repo":      [ { "tool": "...", "repo": "...", "windows": {...} } ],
  "session_aggregates_by_repo": {
    "30d": [ { "repo": "...", "sessions": ..., "messages": ..., ... } ],
    "7d": [...], "90d": [...]
  }
}

loaded_skills deliberately stays repo-less — installedSkills[] is a snapshot of a user's whole catalogue at activation time, not a per-repo concept. The HTML report greys out the loaded-skills section when a repo filter is active.
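
The subtract-to-exclude approach mentioned in the 1.4 comment can be illustrated for activation counts. The report does this client-side in the template, so the Python below is only a sketch of the arithmetic with a hypothetical function name; it covers activations only, since distinct counts such as users and sessions are not additive across repos and how the report adjusts them is not described here.

```python
def activations_excluding(tiles, tiles_by_repo, excluded_repos, window="30d"):
    """Per-tile activations after removing the excluded repos' contributions."""
    # Start from the all-repos rollup.
    totals = {
        t["tile"]: t["windows"].get(window, {}).get("activations", 0)
        for t in tiles
    }
    # Subtract each excluded repo's share of each tile.
    for row in tiles_by_repo:
        if row["repo"] in excluded_repos:
            share = row["windows"].get(window, {}).get("activations", 0)
            totals[row["tile"]] = totals.get(row["tile"], 0) - share
    return totals
```

Subtracting, rather than summing the remaining per-repo rows, keeps events that have no gitRepo (and repos beyond the top_repos cap) in the totals, because those appear only in the all-repos rollups.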

No warnings field. No derived buckets or scores. The shape stays uniform whether the data is empty, partial, or complete.

Verify

jq -e '.schema_version == "1.4"'         "$OUTPUT_DIR/org_usage.json" > /dev/null
jq -e '.totals | has("30d")'             "$OUTPUT_DIR/org_usage.json" > /dev/null
jq -e '.filter | has("email_domains")'   "$OUTPUT_DIR/org_usage.json" > /dev/null
jq -e '.repos | type == "array"'         "$OUTPUT_DIR/org_usage.json" > /dev/null
jq -e '.tiles_by_repo | type == "array"' "$OUTPUT_DIR/org_usage.json" > /dev/null
[ -s "$OUTPUT_DIR/org_usage.html" ] && grep -q "Org Skill Usage" "$OUTPUT_DIR/org_usage.html"

Summary line (fetch)

Org usage fetched in <T>s.
  Filter:        repos=github.com/tesslio OR emails=@tessl.io
  30d slice:     <M> of <N> events selected (<X> via repo, <Y> via email; <Z> excluded; <W> had no gitRepo)
  Windows:       [7, 30, 90]
  Primary 30d:   <N> activations, <N> users, <N> sessions
  Tiles seen:    <N>
  Skills seen:   <N> (top 50 tiles, all windows)
  Untiled:       <N> (no skillTile attribution)
  Loaded skills: <N> (from installedSkills snapshots)
  MCP tools:     <N>
  Repos seen:    <N> (capped at top <top_repos>)
  By-repo rows:  <N> tile×repo, <N> skill×repo, <N> untiled×repo, <N> mcp×repo
  Output:        <path>

Failure modes

No API key (neither $POSTHOG_PERSONAL_API_KEY nor key file readable) → exits with a message pointing at the PostHog settings page.
Auth check returns 403 → exits with the body of PostHog's response so missing-scope errors are obvious.
A query exceeds the 10s ClickHouse cap or PostHog returns 5xx → the client retries with backoff (5s, then 15s); after that, the script exits with the response body inline. Re-running almost always succeeds (PostHog caches the result once the heavy scan finishes).
Schema validation fails on the assembled output → exits 2 (same convention as the other phases) so a malformed org_usage.json never reaches downstream consumers.
Renderer cannot find the placeholder in the template → exits with a clear error.

Standalone testability

Only needs the API key:

# Create the key directory first; the redirect fails if it does not exist.
mkdir -p ~/.tessl/posthog
echo "phx_..." > ~/.tessl/posthog/personal-api-key && chmod 600 ~/.tessl/posthog/personal-api-key
python3 <skill-dir>/scripts/fetch_org_usage.py --output /tmp/org_usage.json
python3 <skill-dir>/scripts/render_org_usage.py --input /tmp/org_usage.json --output /tmp/org_usage.html
# macOS; on Linux use xdg-open instead.
open /tmp/org_usage.html

No discovery, no other phase outputs, no scan context. The result is the same regardless of which repo you happen to be sitting in.